feat(changelog): get from jsDelivr filelist if possible #640
Looks clear now! Maybe add some tests where there are "almost-right" files in jsDelivr that shouldn't be marked as resolved, so that this doesn't regress?
Not sure I got that, can you explain?
Co-authored-by: Haroen Viaene <hello@haroen.me>
The call is not as fast as I hoped. It seems fast in the browser, but that's probably because it's cached. The average is ~4s. Maybe @jimaek has some input on this? The function is here: npm-search/src/jsDelivr/index.ts Lines 77 to 102 in fbe2e97
@bodinsamuel if you request listings for every package, that seems about right (for now). There are two cache layers:
However, if we don't have the listing stored yet (the very first request for that package/version), the response time is in seconds. I'm assuming that if you call this for every single package, you're going to end up with lots of DB misses at the beginning.
Makes sense, and since we will probably re-request at each version increment, we will almost always get a cache miss, haha. Thanks for the answer.
When we re-request the same package, won't it be for a different version, and thus a cache miss? I'm not sure what the alternative is though. Maybe unpkg can be an option, if it has a higher cache hit rate? Or requesting and untarring in advance to make the process non-blocking?
unpkg will have the same issue: it will have to download the npm package and cache it. So the first request for a new version will also be a miss and pretty slow.
@MartinKolarik maybe we could somehow speed things up on our side?
Looking at our DB, we have > 2.5M package versions cached (1.2M unique packages), but there might be some other things slowing this down. At least in some cases, I see that the npm API that we use for resolving versions was unusually slow to respond and took most of the overall processing time. I'm going to check this further but in the meantime I'd also ask you to add a
Good point, adding it in #646
@bodinsamuel we didn't make any changes yet, but I see an average response time of 1300 ms for the past 24 hours for your requests. Do you see a similar improvement in your metrics?
I deployed an optimized version at around 5 PM UTC; hopefully you'll see some additional improvement.
We don't observe that sharp drop unfortunately, that's weird. Unrelated to that, I have noticed a few other things:
That one is corrupt, try to download and open it: https://registry.npmjs.org/gital/-/gital-0.0.0.tgz
That may indeed happen sometimes if you request it too soon, due to the various caches involved. Btw, I'm also seeing an increase in request throughput, which corresponds with the per-request latency drop.
Yes, I have changed our infra and parallelism to process way more packages.
# 1.0.0 (2021-07-19) ### Bug Fixes * 1.0.1 ([#655](#655)) ([5c2cb7f](5c2cb7f)) * add expiresAt field ([#643](#643)) ([dba5d2a](dba5d2a)) * add new worker to bootstrap ([#636](#636)) ([ebbe3df](ebbe3df)) * cache dns ([#654](#654)) ([e80d437](e80d437)) * cache total downloads ([#653](#653)) ([99be307](99be307)) * deprecated facets should be boolean ([#638](#638)) ([19d30d0](19d30d0)) * docker build ([#651](#651)) ([947058d](947058d)) * expiresAt can be a numericFilter ([#664](#664)) ([e89fd14](e89fd14)) * improve logging + remove catchup ([#647](#647)) ([cbc545d](cbc545d)) * increase mem + round downloadRatio ([#644](#644)) ([8ef8425](8ef8425)) * mini fixes ([#659](#659)) ([d34bcc1](d34bcc1)) * setup circleci ([#593](#593)) ([4472405](4472405)) * stop using unpkg ([#658](#658)) ([aae2d86](aae2d86)) * throw outside try ([#661](#661)) ([d36a77a](d36a77a)) * typo ([#637](#637)) ([94851af](94851af)) * up semantic release ([#667](#667)) ([94d8d6c](94d8d6c)) * various ([#663](#663)) ([18fea1e](18fea1e)) * **algolia:** missing config param ([#387](#387)) ([d25ea19](d25ea19)) * **alternative names:** remove prismjs -> prismjs.js ([a1bad34](a1bad34)) * **deps:** update dependency @sentry/node to v5.10.2 ([9c445b0](9c445b0)) * **deps:** update dependency @sentry/node to v5.11.0 ([a858954](a858954)) * **deps:** update dependency @sentry/node to v5.12.4 ([efd6140](efd6140)) * **deps:** update dependency @sentry/node to v5.15.4 ([965fffb](965fffb)) * **deps:** update dependency @sentry/node to v5.15.5 ([89f234e](89f234e)) * **deps:** update dependency @sentry/node to v5.17.0 ([3563f6d](3563f6d)) * **deps:** update dependency @sentry/node to v5.19.1 ([394cb8c](394cb8c)) * **deps:** update dependency @sentry/node to v5.30.0 ([56421c5](56421c5)) * **deps:** update dependency @sentry/node to v5.6.2 ([667e12f](667e12f)) * **deps:** update dependency @sentry/node to v5.7.0 ([55b410d](55b410d)) * **deps:** update dependency @sentry/node to v5.7.1 ([bec31ba](bec31ba)) * **deps:** update 
dependency @sentry/node to v5.9.0 ([6599c79](6599c79)) * **deps:** update dependency algoliasearch to v3.34.0 ([11f49b6](11f49b6)) * **deps:** update dependency algoliasearch to v3.35.0 ([c4faa7a](c4faa7a)) * **deps:** update dependency algoliasearch to v3.35.1 ([837ba44](837ba44)) * **deps:** update dependency algoliasearch to v4.9.3 ([#628](#628)) ([78e3617](78e3617)) * **deps:** update dependency async to v2.6.3 ([4a9cf53](4a9cf53)) * **deps:** update dependency async to v3.2.0 ([3aa436e](3aa436e)) * **deps:** update dependency bunyan to v1.8.15 ([912e7bc](912e7bc)) * **deps:** update dependency dotenv to v8.1.0 ([b785e8f](b785e8f)) * **deps:** update dependency dotenv to v8.2.0 ([ad5f3fb](ad5f3fb)) * **deps:** update dependency dtrace-provider to v0.8.8 ([4879231](4879231)) * **deps:** update dependency gravatar-url to v3.1.0 ([f66b8ee](f66b8ee)) * **deps:** update dependency hot-shots to v6.4.1 ([f84aa5f](f84aa5f)) * **deps:** update dependency hot-shots to v6.5.1 ([2bdeb8e](2bdeb8e)) * **deps:** update dependency hot-shots to v6.8.1 ([1a58429](1a58429)) * **deps:** update dependency hot-shots to v6.8.2 ([a09e193](a09e193)) * **deps:** update dependency hot-shots to v6.8.5 ([871e2e5](871e2e5)) * **deps:** update dependency hot-shots to v6.8.7 ([fc61f4b](fc61f4b)) * **deps:** update dependency lodash to v4.17.13 [security] ([ad8a7ea](ad8a7ea)) * **deps:** update dependency lodash to v4.17.14 ([10e1777](10e1777)) * **deps:** update dependency lodash to v4.17.15 ([a0f2d0d](a0f2d0d)) * **deps:** update dependency lodash to v4.17.19 [security] ([38bd4e0](38bd4e0)) * **deps:** update dependency lodash to v4.17.21 ([baf7442](baf7442)) * **deps:** update dependency ms to v2.1.3 ([b4f0289](b4f0289)) * **deps:** update dependency nano to v8.2.2 ([a4befee](a4befee)) * **deps:** update dependency nano to v8.2.3 ([2c2272c](2c2272c)) * **deps:** update dependency nice-package to v3.1.2 ([55d8953](55d8953)) * **deps:** update dependency object-sizeof to v1.5.1 
([33296d3](33296d3)) * **deps:** update dependency object-sizeof to v1.5.2 ([eeb434a](eeb434a)) * **deps:** update dependency object-sizeof to v1.6.0 ([715f2f6](715f2f6)) * **deps:** update dependency object-sizeof to v1.6.1 ([24945f3](24945f3)) * **dev:** upgrade env ([#592](#592)) ([3c66c56](3c66c56)) * **dev:** upgrade env /2 ([#595](#595)) ([a86cd71](a86cd71)) * **formatPkg:** remove non-existing versions ([c37d6d6](c37d6d6)), closes [#534](#534) * **package.json:** add repo url ([#649](#649)) ([6b248b5](6b248b5)) * empty change ([#405](#405)) ([475e366](475e366)) * id of null ([#406](#406)) ([8e5fb1d](8e5fb1d)) * kill process regurlarly, for cache and bootstrap ([#412](#412)) ([9c778b2](9c778b2)) * **esm:** avoid errors, slightly deal with arrays ([f5eefa9](f5eefa9)) * **formatPkg:** cleaned main can be an array ([#395](#395)) ([7ef7f2f](7ef7f2f)) * **getFilesList:** call using package object ([6b954d5](6b954d5)) * **jsdelivr:** fetch just npm hits ([#375](#375)) ([25d29dd](25d29dd)), closes [#371](#371) * **lint:** correct setup to require extension ([#381](#381)) ([29afbd5](29afbd5)) * **saveDocs:** filter out wrong docs more robustly ([bc81351](bc81351)) * **size:** more exact truncating of readme ([#559](#559)) ([f6187c1](f6187c1)) * **ts:** main can be array ([b619daa](b619daa)) * **TS:** infer definitions correctly ([#357](#357)) ([143aa06](143aa06)) * **TS:** pass correct object ([cdf334b](cdf334b)) * **TS:** support scoped packages ([#364](#364)) ([655e86a](655e86a)) * **unpkg:** remove json flag + add unit test ([#392](#392)) ([d706694](d706694)) * import correctly got ([bb11884](bb11884)) * multiple small bugs after [#379](#379) ([#380](#380)) ([0580052](0580052)) * **config:** fully correct objectIDs ([b25fd81](b25fd81)) * **config:** use allowed chars for objectID ([34f41bb](34f41bb)) * **deps:** update dependency algoliasearch to v3.27.0 ([6c87eed](6c87eed)) * **deps:** update dependency algoliasearch to v3.27.1 ([0985d20](0985d20)) * **deps:** 
update dependency algoliasearch to v3.28.0 ([d48ad9c](d48ad9c)) * **deps:** update dependency algoliasearch to v3.29.0 ([d6057d5](d6057d5)) * **deps:** update dependency algoliasearch to v3.30.0 ([1a571ad](1a571ad)) * **deps:** update dependency algoliasearch to v3.31.0 ([5448c89](5448c89)) * **deps:** update dependency algoliasearch to v3.32.0 ([f52c1a8](f52c1a8)) * **deps:** update dependency algoliasearch to v3.32.1 ([c93f30f](c93f30f)) * **deps:** update dependency algoliasearch to v3.33.0 ([e26d4d9](e26d4d9)) * **deps:** update dependency async to v2.6.2 ([f9a9cb3](f9a9cb3)) * **deps:** update dependency babel-preset-env to v1.7.0 ([9081d2d](9081d2d)) * **deps:** update dependency bunyan-debug-stream to v1.1.0 ([f3c9d7e](f3c9d7e)) * **deps:** update dependency bunyan-debug-stream to v1.1.1 ([deccb8b](deccb8b)) * **deps:** update dependency dotenv to v6 ([#213](#213)) ([1b40279](1b40279)) * **deps:** update dependency dotenv to v6.1.0 ([0c8cc10](0c8cc10)) * **deps:** update dependency dotenv to v6.2.0 ([a54c1eb](a54c1eb)) * **deps:** update dependency got to v8.3.1 ([2376f53](2376f53)) * **deps:** update dependency got to v8.3.2 ([fcf2550](fcf2550)) * **deps:** update dependency hosted-git-info to v2.7.1 ([751b0af](751b0af)) * **deps:** update dependency lodash to v4.17.10 ([075a877](075a877)) * **deps:** update dependency lodash to v4.17.11 ([e49680a](e49680a)) * **deps:** update dependency ms to v2.1.2 ([cb207be](cb207be)) * **deps:** update dependency nice-package to v3.0.4 ([7a2b490](7a2b490)) * **deps:** update dependency nice-package to v3.1.0 ([361d409](361d409)) * **deps:** update dependency object-sizeof to v1.3.0 ([976f0fd](976f0fd)) * **deps:** update dependency object-sizeof to v1.3.1 ([fe25f6a](fe25f6a)) * **deps:** update dependency object-sizeof to v1.4.0 ([ad57ee8](ad57ee8)) * **formatPkg:** correct name ([b8175f3](b8175f3)) * **formatPkg:** don't discard packages without author, but with owners[] ([da66fb9](da66fb9)) * **npm:** allow undefined 
downloads ([a0d9c5a](a0d9c5a)) * **npm:** catch errors ([483c0c4](483c0c4)) * **stage:** push correct stage to statemanager ([00b0571](00b0571)) * **ts:** no double slashes ([dd84f88](dd84f88)) * **unpkg:** catch errors ([4efcd01](4efcd01)) * set settings on bootstrap when we start ([e35c0d1](e35c0d1)) * wait for deletion to happen beore continuing ([0734436](0734436)) * **bootstrap:** move to production only in bootstrap ([#126](#126)) ([b26dce6](b26dce6)) * **changelog:** add defaults to catch errors properly ([91e6ebd](91e6ebd)) * **changelog:** fall back to master if the gitHead is undefined ([52fe6ff](52fe6ff)) * **changelogs:** guard for null and undefined ([0a0a748](0a0a748)) * **computed:** use the cleaned package to match keys ([44a839c](44a839c)) * **deletes:** handle npm deletions ([1ad5025](1ad5025)) * **dependedUpon:** encode start and en keys ([24c5fe9](24c5fe9)) * **deps:** pin dependencies ([d1c1377](d1c1377)) * **deps:** update dependency algoliasearch to v3.24.11 ([e8a61bc](e8a61bc)) * **deps:** update dependency algoliasearch to v3.24.12 ([cea8a73](cea8a73)) * **deps:** update dependency algoliasearch to v3.25.1 ([7457f4e](7457f4e)) * **deps:** update dependency algoliasearch to v3.26.0 ([6fde846](6fde846)) * **deps:** update dependency dotenv to v5.0.0 ([#107](#107)) ([e972e19](e972e19)) * **deps:** update dependency dotenv to v5.0.1 ([acc314c](acc314c)) * **deps:** update dependency got to v8.0.3 ([2717b36](2717b36)) * **deps:** update dependency got to v8.2.0 ([64c2318](64c2318)) * **deps:** update dependency got to v8.3.0 ([19efaf8](19efaf8)) * **deps:** update dependency hosted-git-info to v2.6.0 ([0091297](0091297)) * **deps:** update dependency lodash to v4.17.5 ([d07ad04](d07ad04)) * **downloads:** be resilient for 404 or downloads endpoint for a chunk ([866fbcf](866fbcf)) * **downloads:** filter out scoped packages ([76f571a](76f571a)), closes [#36](#36) * **downloads:** set default of 0 ([d157450](d157450)) * **formatPkg:** rewrite get 
info into separate functions ([92706ce](92706ce)) * **gitHead:** fix bad backward compat ([dc34d24](dc34d24)) * **gitHead:** fix bad backward compat ([195108f](195108f)) * **gitHead:** put back gitHead ([92d373d](92d373d)), closes [#53](#53) [#64](#64) * **memleak:** in watch mode, do not use promise chain ([3f2e860](3f2e860)) * **memleak:** maybe fix it ([711b830](711b830)) * **memory:** don't keep a reference of the `chain` in watch ([f973913](f973913)) * **merge:** bad merge from me ([359f498](359f498)) * **schema:** backwards-compatible ([1b24b21](1b24b21)) * **settings:** put synonyms and rules in the configure file ([#128](#128)) ([af8e709](af8e709)), closes [#123](#123) * **stateManager:** don't assume starting at "zzz" ([c79e6cd](c79e6cd)) * **timeouts:** increase pouch timeout ([#174](#174)) ([a9ccb77](a9ccb77)) * **url:** try to fix url for good ([7da9daf](7da9daf)) * **watch:** add missing return ([30f6e43](30f6e43)) * **watch:** avoid memleak by not piling up docs ([#130](#130)) ([4522ee5](4522ee5)) ### Features * add health API ([#650](#650)) ([95587a3](95587a3)) * add methods to process a single package ([#652](#652)) ([a3c41f3](a3c41f3)) * prepare docker ([#648](#648)) ([21b5d02](21b5d02)) * process package in queue instead of batch ([#656](#656)) ([c4f2aa2](c4f2aa2)) * **babel:** add a forced keyword to babel plugins ([440f344](440f344)) * **changelog:** add changes variations ([e5ce4dc](e5ce4dc)) * **changelog:** detect /changelog.markdown ([bcf21a1](bcf21a1)) * **changelog:** get from jsDelivr filelist if possible ([#640](#640)) ([dd386d2](dd386d2)) * **data:** add "bin" ([446d212](446d212)) * **data:** add "versions" attribute ([766a9c3](766a9c3)) * **data:** add concatenated name ([72ab12e](72ab12e)), closes [#33](#33) * **data:** add flagging of type=module ([#386](#386)) ([7cd0765](7cd0765)) * **data:** add jsDelivr hits ([#263](#263)) ([adff89d](adff89d)) * **deprecated:** add the attribute for faceting ([#160](#160)) ([afe02c8](afe02c8)), 
closes [#159](#159) * **devDeps:** add devDependencies ([01058ef](01058ef)) * **faceting:** allow searching in keywords and owner ([8dd2cda](8dd2cda)) * **formatPkg:** add .js to alternative names ([#383](#383)) ([8463308](8463308)), closes [#217](#217) * **jsDelivr:** move code, add tests, preload data correctly ([#384](#384)) ([373d341](373d341)) * **keywords:** add webpack-scaffold ([#296](#296)) ([d4e57a7](d4e57a7)) * **npm:** Include directory details from repository objects ([#320](#320)) ([ccb1766](ccb1766)) * **process:** redo bootstrap after X amount of time ([a79d999](a79d999)), closes [#20](#20) * **quality:** add a flag for very low quality packages ([314cafb](314cafb)) * **query rules:** add filtering on attr:value ([#221](#221)) ([ebcbf56](ebcbf56)) * **ranking:** do tie breaking based on the magnitude of downloads ([#178](#178)) ([85b631f](85b631f)) * **relevance:** add some synonyms ([#192](#192)) ([760f34a](760f34a)) * **relevance:** enable alternative names query rule ([#195](#195)) ([01217e8](01217e8)), closes [#194](#194) * **relevance:** put name, description and eywords on same level ([#188](#188)) ([ee62193](ee62193)) * **relevance:** use jsDelivr hits for ranking ([#269](#269)) ([9039f76](9039f76)) * **relevancy:** add deprecated in account when sorting ([0b2add3](0b2add3)) * **requests:** add user-agent and httpsAgent ([#646](#646)) ([5a48ad3](5a48ad3)) * **schema:** move git head into githubRepo ([5cbf4e4](5cbf4e4)) * **tracking:** save which stage is currently activated ([dbb7b98](dbb7b98)) * **ts:** allow faceting ([e19e0b0](e19e0b0)) * **ts:** use jsdelivr to check for d.ts ([#645](#645)) ([fbe2e97](fbe2e97)) * **typescript:** pre-load definitely typed pkg ([#639](#639)) ([3968726](3968726)) * add Sentry ([#390](#390)) ([8c08fd5](8c08fd5)) * experimental modules compat ([4f31ab3](4f31ab3)) * full TS migration ([#626](#626)) ([fddc2a8](fddc2a8)) * refacto (part 2) ([#396](#396)) ([2df582b](2df582b)) * **sentry:** wait for the right 
amount of time. ([#391](#391)) ([d2f00e2](d2f00e2)) * move algolia ([#385](#385)) ([e5d7bec](e5d7bec)) * refacto (part 1) ([#371](#371)) ([c024451](c024451)) * upgrade packages ([#374](#374)) ([3c70053](3c70053)) * **relevance:** merge all the query rules ([#194](#194)) ([9a24fcc](9a24fcc)) * **settings:** allow to make a PR which changes both the settings and the data ([#179](#179)) ([e8f7c2a](e8f7c2a)) * **tags:** add `tags` to the schema ([57a476e](57a476e)) * **third-party:** add handling of Angular CLI schematics, and rework registry subset ([#169](#169)) ([bfab179](bfab179)) * **vue-cli:** add a forced keyword to vue-cli plugins ([3d6ed42](3d6ed42)) * **yeoman:** Identify yeoman generators through computedKeywords ([#181](#181)) ([08c81af](08c81af)) * Add repository info ([#101](#101)) ([29f6fa0](29f6fa0)) ### Reverts * Revert "Revert "chore(deps): update babel monorepo to v7.6.2"" ([4cf094e](4cf094e)) * Revert "Revert "chore(deps): update dependency lint-staged to v9.4.0"" ([11bd8d6](11bd8d6))
getChangelogs is currently the slowest operation and probably the one with the most impact on the network, because it queries N candidate files in parallel, multiplied by X packages.
To help with this, try to fetch the file list from jsDelivr first and check it for the presence of a changelog.
If a file matches, we should gain 4s on average (up to 8s). If nothing matches (changelogs are not always published along with the code), we fall back to the brute-force method.
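For illustration, the lookup described above could be sketched roughly as follows. This is not the code from this PR: the `ListedFile` shape, the candidate names and extensions, and the `findChangelog` / `resolveChangelog` helpers are all assumptions made for the example. It also shows the kind of "almost-right" files mentioned in the review (nested paths, suffixed names) that should not count as a match.

```typescript
// Hypothetical shape of one entry in a flat file listing (an assumption,
// not necessarily the exact jsDelivr response type).
interface ListedFile {
  name: string; // absolute path inside the package, e.g. "/CHANGELOG.md"
}

// Assumed candidate names and extensions, matched case-insensitively.
const CHANGELOG_NAMES = ['changelog', 'changes', 'history'];
const CHANGELOG_EXTENSIONS = ['', '.md', '.markdown', '.txt'];

// Returns the original path of the first root-level changelog, or null.
function findChangelog(files: ListedFile[]): string | null {
  for (const file of files) {
    const lower = file.name.toLowerCase();
    // Only accept root-level files: a leading slash and no further slash.
    if (!lower.startsWith('/') || lower.indexOf('/', 1) !== -1) continue;
    const base = lower.slice(1);
    for (const name of CHANGELOG_NAMES) {
      for (const ext of CHANGELOG_EXTENSIONS) {
        // Exact match only, so "/changelog.md.bak" is rejected.
        if (base === name + ext) return file.name;
      }
    }
  }
  return null;
}

// Uses the file list when available, otherwise (or on no match) defers to
// a caller-supplied brute-force lookup. `bruteForce` is a placeholder for
// the existing "try every candidate URL" method.
async function resolveChangelog(
  files: ListedFile[] | null,
  bruteForce: () => Promise<string | null>
): Promise<string | null> {
  const hit = files ? findChangelog(files) : null;
  return hit !== null ? hit : bruteForce();
}
```

Keeping the matching in a pure function makes the regression tests cheap: no network is needed to check that `/docs/CHANGELOG.md` or `/CHANGELOG-old.md` is not resolved.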