Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loading upstream tarball doesn't respect package.proxy array #1642

Closed
Beanow opened this issue Jan 5, 2020 · 12 comments · Fixed by #1644
Closed

Loading upstream tarball doesn't respect package.proxy array #1642

Beanow opened this issue Jan 5, 2020 · 12 comments · Fixed by #1644

Comments

@Beanow
Copy link
Contributor

Beanow commented Jan 5, 2020

Describe the bug

Having multiple uplinks point to the same registry, the package.proxy array gets ignored.
Instead we'll walk through the uplinks and select the last one in the list that can serve the URL we've found for our upstream tarball.

To Reproduce

storage: ./storage

uplinks:
  cachednpm:
    url: https://registry.npmjs.org/
    cache: true
  npmjs:
    url: https://registry.npmjs.org/
    cache: false

packages:
  '**':
    access: $all
    publish:
    proxy: cachednpm
  1. Run with above config (listening on localhost)
  2. Try requesting http://localhost:4873/@verdaccio/streams/-/streams-8.5.2.tgz
  3. Notice that ./storage/@verdaccio/streams/streams-8.5.2.tgz does not exist

The uplink chosen here was npmjs, not cachednpm.

Expected behavior

Making the upstream request should respect the packages.proxy array.
In this example, selecting cachednpm for it's uplink, causing it to cache the tarball.

Environment information

Verdaccio version: 4.4.1
System:
    OS: Linux 5.0 Ubuntu 18.04.3 LTS (Bionic Beaver)
Binaries:
    Node: 12.13.1 - /usr/local/bin/node
    Yarn: 1.21.1 - /usr/local/bin/yarn
    npm: 6.13.4 - /usr/local/bin/npm

Additional context

I believe the culprit here is:

for (const uplinkId in self.uplinks) {
if (self.uplinks[uplinkId].isUplinkValid(file.url)) {
uplink = self.uplinks[uplinkId];
}
}

The isUplinkValid check will be true for both uplinks. So the last one in the array will be selected.
Perhaps it should be using hasProxyTo from src/lib/config-utils.ts?

@sergiohgz
Copy link
Member

sergiohgz commented Jan 5, 2020

In the config above, every package (denoted by '**') are requested from cachednpm uplink, so it will be retrieved from cache.
Can you provide Verdaccio trace or try with a specific config?

@Beanow
Copy link
Contributor Author

Beanow commented Jan 5, 2020

In the config above, every package (denoted by '**' are requested from cachednpm uplink, so it will be retrieved from cache.

That's what I would expect. But it doesn't 😄

@nothingismagick
Copy link

nothingismagick commented Jan 5, 2020

Its the for loop - it doesn't exit at the first resolution, like you mentioned @Beanow - maybe the "workaround" is to have the one you really want to be last in the array?

@sergiohgz
Copy link
Member

Try setting a specific package (like 'react') to be taken from cachednpm and other without cache. Then, paste results and check if the storage contains cached package.

@Beanow
Copy link
Contributor Author

Beanow commented Jan 5, 2020

I've hacked in a log (in build/lib/storage.js), just after the for-loop I linked.

console.log('DECIDED ON UPLINK:', uplink.upname);

This outputs DECIDED ON UPLINK: npmjs
Notice I've not added npmjs as proxy for any package at all.

@Beanow
Copy link
Contributor Author

Beanow commented Jan 5, 2020

Did you mean something like this @sergiohgz?

uplinks:
  cachednpm:
    url: https://registry.npmjs.org/
    cache: true
  npmjs:
    url: https://registry.npmjs.org/
    cache: false

packages:
  'statuses':
    access: $all
    proxy: cachednpm
  '**':
    access: $all
    proxy: npmjs
  1. Request http://localhost:4873/@verdaccio/streams/-/streams-8.5.2.tgz

    DECIDED ON UPLINK: npmjs

  2. Request http://localhost:4873/statuses/-/statuses-1.5.0.tgz

    DECIDED ON UPLINK: npmjs

$ tree ./storage
./storage
├── statuses
│   └── package.json
└── @verdaccio
    └── streams
        └── package.json

@sergiohgz
Copy link
Member

I cannot reproduce the issue, let me investigate deeper, it's working as expected.

@Beanow
Copy link
Contributor Author

Beanow commented Jan 5, 2020

@nothingismagick

Its the for loop - it doesn't exit at the first resolution, like you mentioned @Beanow - maybe the "workaround" is to have the one you really want to be last in the array?

It will always select the last one with a matching URL, so it won't let you use different settings for different packages that way.

@sergiohgz
Copy link
Member

Well, I understand your thoughts, and that's correct, the loop is not intended to stop if it find several urls that matches the uplink selected, so it will get latest uplink. I'm going to make a PR with the fix soon 😉

On the other hand, we didn't evaluate the need for setting the same URL for different uplink definitions. Why would you want to do that? (pure curiosity 😃)

@Beanow
Copy link
Contributor Author

Beanow commented Jan 5, 2020

Well, as the example shows, the uplink definition includes some configuration like cache: boolean.
I would like to control those for different packages.

For example, some should cache while others shouldn't.

The particular use-case I'm thinking of, is to provide a redundant host for our own packages when you're hitting a firewall for the major registries. To preserve disk-space it should proxy (but not cache) '**', but should cache our project's packages.

@Beanow
Copy link
Contributor Author

Beanow commented Jan 5, 2020

-        if (self.uplinks[uplinkId].isUplinkValid(file.url)) {
+        if (hasProxyTo(name, uplinkId, self.config.packages)) {

This change works for me. Though should probably add a unit test to prove.

juanpicado added a commit that referenced this issue Jan 11, 2020
* test: different uplinks with the same URL

This test reproduces #1642

* fix: use hasProxyTo to find correct uplink for tarballs

Fixes #1642

Co-authored-by: Juan Picado @jotadeveloper <juanpicado19@gmail.com>
@lock
Copy link

lock bot commented Apr 15, 2020

🤖This thread has been automatically locked 🔒 since there has not been any recent activity after it was closed.
We lock tickets after 90 days with the idea to encourage you to open a ticket with new fresh data and to provide you better feedback 🤝and better visibility 👀.
If you consider, you can attach this ticket 📨 to the new one as a reference for better context.
Thanks for being a part of the Verdaccio community! 💘

@lock lock bot added the outdated label Apr 15, 2020
@lock lock bot locked as resolved and limited conversation to collaborators Apr 15, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
4 participants