Skip to content

Clean up build-time API calls to Open Processing #831

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
clairep94 opened this issue May 4, 2025 · 21 comments
Open

Clean up build-time API calls to Open Processing #831

clairep94 opened this issue May 4, 2025 · 21 comments
Assignees
Labels
Bug Something isn't working

Comments

@clairep94
Copy link
Contributor

clairep94 commented May 4, 2025

Cleanup post-resolution of failed builds (edited on May 14th):

Following the patch-fix to resolve builds hitting OP rate limits and discussions around updates to the api/sketch/[id] and api/curation/sketches, there are a few follow-up tasks for final cleanup:

  1. OP: Update the data returned for the following endpoints in:
// `/api/sketch/:id`
fullname: string
// `/api/curation/:curationId/sketches`
licence: string
  1. p5.js-website: Clean up getSketch to remove license: '' from below:
//https://github.com/processing/p5.js-website/blob/main/src/api/OpenProcessing.ts#L90-L94
if (memoizedSketch) {
      return {
        ...memoizedSketch,
        license: "", // take this out
      } as OpenProcessingSketchResponse;
    }
  1. p5.js-website: Decouple SketchLayout and SketchesLayout by removing authorName from SketchLayout props
// intended change on https://github.com/processing/p5.js-website/blob/main/src/layouts/SketchLayout.astro#L18-L24

interface Props {
  sketchId: string;
}

const { sketchId } = Astro.props;
const { title, createdOn, instructions, fullname: authorName } = await getSketch(sketchId);
  1. p5.js-website: Update any types described on https://github.com/processing/p5.js-website/blob/main/src/api/OpenProcessing.ts

Updated issue (last edited May 5th):

After further investigating the original issue below, it seems that the source of PRs failing test in the past week is that when the static site build is happening, and the project is trying to fetch sketch-data from OpenProcessing API, it is hitting 429 Too Many Requests

I've added error logging on data-fetching functions on PR #832, which still fail as expected, but show us the 429 status and error message below.

Image

I have created a patch-fix solution on my fork here, which passes tests/build but we can see in the call-stack that after generating a certain amount of pages for individual sketches, we hit the limit on calls to OpenProcessing API again (please see PR details for screenshots & proposed solutions).

This leads to me believe that the solution may be to copy this issue to the OpenProcessing API repo, which seems to be private, and see if the rate-limit can be raised by 25 hits/10 minutes when being hit for site-builds. I have also proposed a second solution on clairep94#2, which could greatly reduce the amount of hits to OpenProcessing API that a build requires, but which would require some refactoring on the website & update to the API spec for getCurationSketches

Keen to hear everyone's thoughts!


Original Issue:

Potential bug with builds failing locally and on GHA when generating pages for some sketches

Most appropriate sections of the p5.js website?

Community (Sketches, Libraries, Events)

What is your operating system?

Mac OS

Web browser and version

Not an issue on the production site, but with build on GHA & locally

Actual Behavior

I noticed that some PR's created in the last week have been failing on a similar error to below, and the changes in the PRs seem unrelated to the errors.

Image

I tried to build from main locally and encountered the same error, but with varying sketch ids.

I then tried to make a test PR with a nominal change (added a period in some homepage text), and encountered the same error, so potentially there may be something with the metadata of this sketch 2225254, or another source that is having a side-effect of this sketch page failing on build.

Affected PR Failing Sketch ID & Stack Trace
#820 2225254
#821 2225254
#824 2225254
#826 2225254
#828 2225254
#830 (Test PR) 2225254

Expected Behavior

Build should not fail unless there is a breaking change on the branch
Test failures should vary between PRs given they are working on different issues

Steps to reproduce

  • Pull from main & run npm run build locally, or make PR into main
  • Might be good for someone to try this locally a few times to confirm if it is the same sketch ID that is failing each time, or if it varies.

Would you like to work on the issue?

Happy to look into it further or have it be assigned

@clairep94 clairep94 added the Bug Something isn't working label May 4, 2025
@clairep94
Copy link
Contributor Author

clairep94 commented May 4, 2025

Hi @ksen0, tagging you for visibility! Let me know what you think

@clairep94 clairep94 changed the title Potential bug with builds failing locally and on GHA when generating pages for some sketches Builds are getting error:429 Too Many Requests responses from OpenProcessing API, resulting in recent series of failed tests on pull requests May 4, 2025
@ksen0
Copy link
Member

ksen0 commented May 7, 2025

Hey thanks @clairep94 ! Looking into it, will report back.

Edited to add: besides asking to change OP rate-limit, we can also potentially consider splitting build step so that fetching from OP API is not required for every single PR, but rather only when building main or dev-2.0

@msawired
Copy link

msawired commented May 8, 2025

Hello 👋,
@ksen0 let me know about this issue. Yes, there are rateLimits on the API (currently 40 requests/min). I also looked at the code little, and I can see that multiple calls are made to 3 endpoints:

  • api/curation/sketches
  • api/sketch/[id]
  • api/sketch/[id]/code

Looking at how the data is being used, I think it can be optimized to just use api/curation/sketches endpoint, as it contains all the data you need (including sketch and user metadata). I understand that p5js.org/sketches/[id] endpoint uses /api/sketch endpoint. To optimize this, you can lookup the initial curation/sketches data to find sketch metadata instead of making a new call to the API.

Not sure where "sketch/code" is used, as you are using iframes instead of rebuilding the whole sketch. Note that, this /code endpoint will require more strict authentication soon, to prevent webscraping efforts, etc. Possibly Bear Token.

Otherwise, I can also add specific IPs to whitelist to prevent rateLimiting, however that's not my preference, as some wrong code may hog the OP server and we wouldn't want that 😄...

@ksen0
Copy link
Member

ksen0 commented May 8, 2025

Thank you @msawired ! I agree that solutions that do not require changing rate limit should be fully investigated first and looks like removing the use of api/sketch/[id]/code and api/sketch/[id] (and rather storing / accessing original api/curation/sketches metadata) is both possible and would fix the current problem.

Potentially we can do temporary whitelisting to unblock merging, but I do think changing rate limit even temporarily can create more problems by accident, and I don't see critical PRs in the pipeline. So if this can be resolved n a few days, I think that could be alright.

@clairep94 what do you think / do you want to work on the fix?

@clairep94
Copy link
Contributor Author

clairep94 commented May 8, 2025

Hi both happy to work on the fix for the refactor! Will probably have some time this weekend

I think we might need to update api/curation/sketches to return some extra data that api/sketches/[id] was returning:

// from https://openprocessing.org/api/curation/87649/sketches
[
//...
{
    "visualID": "1957050",
    "title": "Generative Succulents",
    "description": "Generative succulents for the #WCCChallenge",
    "parentID": null,
    "thumbnailUpdatedOn": "2023-08-08 04:50:27",
    "videoUpdatedOn": null,
    "userID": "52944",
    "fullname": "newyellow",
    "membershipType": "1",
    "submittedOn": "2024-04-08 03:10:47",
    "status": "1"
  },
//...
]
// from the expected response for /api/sketch/[id]
// see https://github.com/processing/p5.js-website/blob/main/src/api/OpenProcessing.ts#L55-L68

export type OpenProcessingSketchResponse = {
  /** Sketch ID used for constructing URLs */
  visualID: string;
  /** Title of sketch */
  title: string;
  /** Description of sketch */
  description: string;
  instructions: string;
  license: string;
  userID: string;
  submittedOn: string;
  createdOn: string;
  mode: string;
};

@msawired would it be possible to add the following properties to items returned by the curation endpoint?

  instructions: string;
  createdOn: string;
  mode: string;

@ksen0
Copy link
Member

ksen0 commented May 9, 2025

@clairep94 thank you for working on this!

That's a good point that some information is missing. Do you think a temporary fix would be possible of e.g., building in a pause to stay under 40 requests/min? Or if removing /code one actually already resolves the immediate problem?

Though it would be good if the /sketches endpoint provided all info, I think this is critical enough that it would be good to patch our failing tests without having to wait for any changes in OP's API even if they are possible (and then, once the patch is merged to both main and 2.0 branches, coordinate on the more complete solution).

What do you think?

@msawired
Copy link

msawired commented May 9, 2025

yes I can add those data points.
Are these API hits ultimately done on the frontend or on the backend?
If on the frontend, note that large networks such as schools may hit the rate limit if students visit the website at the same time, since they all come from the same IP, and whitelisting your servers wouldn't help. So, it is always good practice to keep the hits as minimum as needed.

@clairep94
Copy link
Contributor Author

@ksen0

  • I'll take a look to see if we can remove the /code hits & any other API-call optimisations we can do will solve the issue, then see if adding the pauses is necessary!

@msawired

  • I believe the site is server-side generated, so the API hits are happening from the "backend" in the sense that the astro server is hitting the API, and creating static pages based on those results
  • I am not the most experienced with SSG, so this might not be 100% correct -- @ksen0 would this be accurate?

Will have something for review likely tomorrow, thanks both :)

@limzykenneth
Copy link
Member

The requests are happening on build so I guess we can consider it "backend"? It will run on each build and that includes all CI build step on each PR. It doesn't run on the built site served to users as the requested data are already built into the site.

@msawired One thing I'm thinking, not sure if there already is something like this already is a bulk request endpoint, so that we can pass a list of sketch ID and OP's API can return all the required data all in one round of request. That way we reduce the number of overall requests and it might be easier on your end for db queries as well if you can group that into one query too. We don't have to do it for this round of fix and can implement this later.

@clairep94 I think you all have a good handle on this and I'm just chiming in to confirm the above, but do ping me whenever if you need any clarifications or help.

@clairep94
Copy link
Contributor Author

clairep94 commented May 10, 2025

Hi @ksen0 @msawired @limzykenneth

I have opened a PR to resolve the pipeline block here:
#838

I looked into removing the call to /code, and saw that it was being called in the function getSketchSize (function here).
It seems like the functions returns a parsed width and height from the code payload if the sketch mode is p5js, then uses them to create a scaling iframe if they are available (codeblock)

If width and height are hard to add, perhaps we can just use the defaulted normal iframe here?

Otherwise, if width and height are easily parsed and added in the backend, perhaps we could add it to the sketch metadata?

@msawired sorry to bother again!
Would it be possible to add the following properties to the following endpoints?

// `/api/sketch/:id`
width: number // if easy/possible
height: number // if easy/possible
fullname: string // missing from curation but in sketch, just to align both endpoints
// `/api/curation/:curationId/sketches`
width: number // if easy/possible
height: number // if easy/possible
licence: string // missing from curation but in sketch, I don't think it's being displayed on the site currently but just to align!

The current PR 838 should unblock merging any open PRs, but I'll make a followup PR to do the optimisation to eliminate the call to /code if the new properties can be available on the other endpoints or if we decide to use normal iframes to display all sketches!

  • I think following this final followup PR, we would end up with a single call to /api/curation/:curationId/sketches to build all sketch pages

@msawired
Copy link

I don't extract width/height so no information is available for that. I may recommend to set the iframe full width and height fixed (or as tall as possible, such as 80vh). The iframe should then center the sketch if possible, smaller than windowWidth/windowHeight.
I can add the license info next week.

@clairep94
Copy link
Contributor Author

@msawired
That makes sense for the width/height thanks! Maybe we just use the default iframe in that case?

Would it be possible to also add fullname to /api/sketch/:id? These would just allow for /api/sketch/:id and /api/curation/:curationId/sketches to be fully aligned

// `/api/sketch/:id`
fullname: string
// `/api/curation/:curationId/sketches`
licence: string

Really appreciate your responsiveness! This is my first open source contribution so I appreciate the feedback

@davepagurek
Copy link
Collaborator

Maybe we just use the default iframe in that case?

One of the reasons why we detect the width/height is so we can change how scaling works on those sketches. On mobile, a fixed sized sketch like 800x600 will require scrolling within the iframe to see the whole thing. If we detect those cases though, we make a fixed-size iframe that we scale up or down with CSS to make sure mobile viewers still see the whole thing by default.

I also think the width/height probably won't ever be part of the API because it's just a heuristic right now -- to do it properly we'd have to actually evaluate the code. There's possibly a world where at runtime, the sketch communicates the size info up to its parent window, but that also isn't super easy to do, so our current solution is to try to catch simple cases via code parsing.

Anyway, using the default iframe would mean compromising on the mobile experience, but if there aren't other good ways to work around API limits then we can consider it to unblock builds.

@ksen0 ksen0 closed this as completed in 13318ef May 12, 2025
@clairep94
Copy link
Contributor Author

Ah ok, in that case maybe the changes on the PR as-is is sufficient?

Builds will still make the call to /code to extract the height and length, but we've optimised calls to /sketch enough to unblock the builds

@ksen0 ksen0 reopened this May 12, 2025
@ksen0
Copy link
Member

ksen0 commented May 12, 2025

Thank you for the fix @clairep94 ! I just merged it into main, looks great 💯 Do you have time to make the equivalent patch for the 2.0 branch please? They've diverged, but cherrypicking your commits to that branch may work for this case! If you don't have capacity let me know and I'll go ahead and do it.

@ksen0
Copy link
Member

ksen0 commented May 12, 2025

PS:

This is my first open source contribution

Great work! Please do feel free to add yourself to the p5.js all-contributors

@msawired
Copy link

🙌 great to hear all is resolved. Do you still need license and fullname on the endpoints above?
In the meantime, I thought of implementing BEAR token and bypassing the rate limit for the BEAR token. Since you are compiling the pages static on the backend, you can use this token privately to access the API with higher rate limit. This is not ready on OP yet, but let me know if you still need it and I can prioritize.

@clairep94
Copy link
Contributor Author

Thanks @ksen0! Happy to do the path for 2.0 too, I should have some time in the eve today (BST).

@msawired license and fullname on the respective endpoints above would be great if you have time!

For the patch, since we don't have license from the curation response, but it's not displayed on the SketchLayout, I just put the below, but it would be nice to clean that up:

if (memoizedSketch) {
      return {
        ...memoizedSketch,
        license: "", // take this out
      } as OpenProcessingSketchResponse;
    }

For fullname, currently SketchLayout takes a authorName prop which uses fullname from when the build gets all curation sketches to create the SketchesLayout, but it would be nice to decouple and have SketchLayout take sketchId as the only prop in the future.

// intended change on https://github.com/processing/p5.js-website/blob/main/src/layouts/SketchLayout.astro#L18-L24

interface Props {
  sketchId: string;
}

const { sketchId } = Astro.props;
const { title, createdOn, instructions, fullname: authorName } = await getSketch(sketchId);

@ksen0
Copy link
Member

ksen0 commented May 13, 2025

Hi @clairep94 ! I happen to have a moment now and wanted to unblock a couple of PRs on 2.0, so I went ahead and copied the patch here: #842

(This was done by creating a new branch under 2.0, then running git cherry-pick 03093c4 9d48a44 43b62aa 8a7bd6c 092e049 63bc0a9 cebf48a 76d4202 7c87f32 fa5e551 4fc8788 and then git push)

Please feel free to keep this issue around for the remaining improvements to OP re: license and full name. @msawired no rush on adding those (since the immediate patch is done), but seems like it would be helpful for a cleanup here. @clairep94 please feel free to update the name of this issue to be more descriptive of this cleanup of OP API usage?

Regarding bearer auth: @msawired this would be great long-term! I think it's farther outside scope of this issue though. @clairep94 do you mind creating a new issue please, for updating website to use bear token when that has been implemented by @msawired ? If that sounds reasonable to you both.

Big thanks to everyone on the thread for finding and resolving this issue!

@msawired
Copy link

Sure, I will post it back here once the token implementation is live!

@clairep94 clairep94 changed the title Builds are getting error:429 Too Many Requests responses from OpenProcessing API, resulting in recent series of failed tests on pull requests Clean up build-time API calls to Open Processing May 14, 2025
@clairep94
Copy link
Contributor Author

Hi @ksen0 and @msawired, I've updated the current issue with the small clean-up tasks, and created the new issue for updating to use the bearer-token here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants