Question: Deploying on Swarm with "Nexus" gives "No such image" #748

Closed
RAKedz opened this issue Jul 6, 2018 · 26 comments
Comments

@RAKedz

RAKedz commented Jul 6, 2018

When issuing the command:

faas deploy -f ./stack.yml --network saturn --send-registry-auth --update

The service created for the function keeps spewing the message:

    "Status": {
        "Timestamp": "2018-07-04T21:44:17.077754447Z",
        "State": "rejected",
        "Message": "preparing",
        "Err": "No such image: myregistry.com:28833/func-nmap:latest",
        "ContainerStatus": {
            "ContainerID": "",
            "PID": 0,
            "ExitCode": 0
        },
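A minimal sketch of how the rejected tasks and their error text could be pulled out of task JSON like the fragment above. It assumes the full document is a JSON array of task objects, each with a "Status" object, as printed by docker inspect; the sample data and the helper name rejected_errors are illustrative only.

```python
import json

# Sample task data modelled on the fragment above (the array wrapper is an assumption).
SAMPLE_TASKS = json.loads("""
[
  {
    "Status": {
      "Timestamp": "2018-07-04T21:44:17.077754447Z",
      "State": "rejected",
      "Message": "preparing",
      "Err": "No such image: myregistry.com:28833/func-nmap:latest",
      "ContainerStatus": {"ContainerID": "", "PID": 0, "ExitCode": 0}
    }
  }
]
""")


def rejected_errors(tasks):
    """Return the Err text of every task whose State is 'rejected'."""
    errors = []
    for task in tasks:
        status = task.get("Status", {})
        if status.get("State") == "rejected":
            errors.append(status.get("Err", ""))
    return errors


for err in rejected_errors(SAMPLE_TASKS):
    print(err)
```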

We have also tried this command since we have a private registry (Nexus):

curl -u "myuser:mypass" -XPOST https://myopenfaas.com/system/functions -d '{ "service": "func-nmap", "image": "myregistry.com:28833/func-nmap", "envProcess": "xargs nmap", "network": "mynetwork", "registryAuth": 'mydockerloginfornexuspass' }'
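One thing worth checking in the command above: the single quotes around the registryAuth value close and reopen the shell string, so the value may reach the gateway without JSON quotes. A minimal sketch of building the same body with a JSON library, which takes care of the quoting, is shown here. The field names are the ones from the curl command; the helper name build_deploy_body is hypothetical.

```python
import json


def build_deploy_body(service, image, env_process, network, registry_auth):
    """Serialise the deploy request body so that every value is valid JSON."""
    return json.dumps(
        {
            "service": service,
            "image": image,
            "envProcess": env_process,
            "network": network,
            "registryAuth": registry_auth,
        }
    )


body = build_deploy_body(
    "func-nmap",
    "myregistry.com:28833/func-nmap",
    "xargs nmap",
    "mynetwork",
    "mydockerloginfornexuspass",  # placeholder value taken from the post above
)
print(body)
```

The printed string can be saved to a file and passed to curl with -d @file, which avoids shell quoting altogether.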

Expected Behaviour

The service should be finding the image and successfully running func-nmap.

Current Behaviour

The commands above indicate a success, but the service will eventually get rejected with an error message that the image doesn’t exist.

I also noticed it takes about 10 mins for the function to show in OpenFaas website and it will take the same amount of time when you delete it from the website. If you try to remove it using the client it will indicate it doesn’t exist, even though the OpenFaas website will still show it along with the client list command. Not sure why there is a long duration between the website and the issuing of the command.

Possible Solution

N/A

Steps to Reproduce (for bugs)

We already have a platform and wanted to include OpenFaaS in it. Our platform is on Digital Ocean using Ubuntu 16.04 servers. We have a private DNS, a registered domain, and a frontend proxy using Nginx. Nginx doesn't run in a container; it has its own server. The website uses basic auth and Let's Encrypt. Nginx proxies to the backend, which runs the applications using Docker Swarm. UFW is used on all servers. The Docker Swarm uses an encrypted network on the private network. We have built a CI/CD pipeline, and we use our own private registry (Nexus), GitLab and Jenkins. It's been fully cooked in for about 3 years.

As I mentioned we decided to add OpenFaas into the platform.

I followed the instructions here to acquire the source code and install it into our existing Swarm with a few minor tweaks.

http://docs.openfaas.com/deployment/docker-swarm/

  1. Edit the docker-compose.yml
  2. Replace the default network with our existing network
  3. Change the ports for gateway and Prometheus from:
    a. 8080:8080 to 8180:8080
    b. 9090:9090 to 9190:9090
  4. cd into prometheus and edit prometheus.yml
  5. Change targets: ['localhost:9090'] to targets: ['localhost:9190']

Now execute the stack as described in the link and check the Swarm to validate.

To add OpenFaas and Prometheus to our website we need to create two A records to our current domain and create a SAN with Let’s Encrypt and open up two ports 8180 and 9190 to the firewall using UFW.

We edited the nginx.conf to create the upstreams for OpenFaas and Prometheus.

upstream openfaas {
    least_conn;
    server app03:8180;
    server app04:8180;
    server app05:8180;
    server app06:8180;
    server app01:8180;
}

upstream prometheus {
    least_conn;
    server app03:9190;
    server app04:9190;
    server app05:9190;
    server app06:9190;
    server app01:9190;
}

We created a site .conf file for each one.

  1. OpenFaaS Contents

server {
    listen 80;
    listen [::]:80;

    server_name www.myopenfaas.com myopenfaas.com;

    location / {
        return 301 https://myopenfaas.com$request_uri;
    }

    location ^~ /.well-known {
        auth_basic off;
        default_type "text/plain";
        root /var/www/openfaas;
        allow all;
    }
}

server {
    access_log /var/log/nginx/access_stream_openfaas.log upstream_time;
    listen 443 ssl;
    server_name myopenfaas.com;
    root /var/www/openfaas;
    ssl on;
    include snippets/ssl-mywebsite.com.conf;
    include snippets/ssl-params.conf;
    include snippets/proxy-openfaas.com.conf;
    client_max_body_size 1G;

    location / {
        include snippets/cors-openfaas.com.conf;
        proxy_cache backendcache;
        proxy_cache_bypass $http_cache_control;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        add_header X-Proxy-Cache $upstream_cache_status;
        proxy_pass http://openfaas;
    }
}

  2. Prometheus Contents

server {
    listen 80;
    listen [::]:80;

    server_name www.myprometheus.com myprometheus.com;

    location / {
        return 301 https://myprometheus.com$request_uri;
    }

    location ^~ /.well-known {
        auth_basic off;
        default_type "text/plain";
        root /var/www/prometheus;
        allow all;
    }
}

server {
    access_log /var/log/nginx/access_stream_prometheus.log upstream_time;
    listen 443 ssl;
    server_name myprometheus.com;
    root /var/www/prometheus;
    ssl on;
    include snippets/ssl-mywebsite.com.conf;
    include snippets/ssl-params.conf;
    include snippets/proxy-prometheus.com.conf;
    client_max_body_size 1G;

    location / {
        include snippets/cors-prometheus.com.conf;
        proxy_cache backendcache;
        proxy_cache_bypass $http_cache_control;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        add_header X-Proxy-Cache $upstream_cache_status;
        proxy_pass http://prometheus;
    }
}

Both websites displayed as expected. Tested the OpenFaaS functions from the website and with curl, which had no issues.

Now to create a simple example function from this website with nmap.

https://blog.alexellis.io/cli-functions-with-openfaas/

I installed the faas-cli on my local Macintosh and on one of the Ubuntu servers. Both types gave the same results.

I tweaked the Dockerfile and used a stack.yml.

  1. Dockerfile Contents

FROM alpine:3.7

ADD https://github.com/openfaas/faas/releases/download/0.8.2/fwatchdog /usr/bin
RUN chmod +x /usr/bin/fwatchdog
RUN mkdir -p /home/app

RUN apk --no-cache add curl \
    && echo "Pulling watchdog binary from Github." \
    && curl -sSL https://github.com/openfaas/faas/releases/download/0.8.2/fwatchdog > /usr/bin/fwatchdog \
    && chmod +x /usr/bin/fwatchdog \
    && cp /usr/bin/fwatchdog /home/app \
    && apk del curl --no-cache

RUN apk add --no-cache nmap
RUN addgroup -S app && adduser -S -g app app
RUN chown app /home/app

WORKDIR /home/app

USER app
ENV fprocess="xargs nmap"
ENV write_debug="false"

HEALTHCHECK --interval=5s CMD [ -e /tmp/.lock ] || exit 1

CMD [ "fwatchdog" ]

  2. Stack.yml Contents

provider:
  name: faas
  gateway: https://myopenfaas.com

functions:
  func-nmap:
    lang: dockerfile
    skip_build: true
    handler: ./func-nmap
    image: myregistry.com:28833/func-nmap
    fprocess: "xargs nmap"
    environment:
      read_timeout: 60
      write_timeout: 60
    constraints:
      - "node.platform.os == linux"

I used Docker and faas-cli to build and deploy the image. The issue comes when you try to publish, as I explained in the first part.

Context

We are working on a PoC and have personally met Alex at a Kubernetes meetup in Austin 2017 and my partner Lyndon met up with Alex again in San Francisco during the DockerCon 2018. He talked with Alex about our PoC and he was interested in it. Our goal is to lower development costs amongst other features.

Your Environment

  • Docker 18.03.0-ce
  • Docker Swarm
  • Operating System and version Ubuntu 16.04 and MacOS High Sierra
  • Use Latest Nginx.
  • Use Latest Nexus for private registry.

Thanks,

Rebecca

@alexellis
Member

Hi Rebecca,

I met Lyndon and he told me you were using Kubernetes in production and Swarm in development.

I haven't used Nexus. Do you know whether there is a free/open-source version I can find to set up and reproduce your issue?

Have you tried pushing an image to the Docker Hub, setting it to "private" and then deploying that too?

As far as I am aware the private repo work was tested both with the Docker Hub and the "Docker open-source registry" in the "registry:latest" image. I suspect there may be a configuration issue somewhere in your environment.

Is "myregistry.com:28833" a valid address? What happens if you try to run this from your laptop?

i.e.

docker run -ti myregistry.com:28833/func-nmap:latest sh

The correct flag is --send-registry-auth; you do not need the --update flag.

I wonder if you could set up a DigitalOcean droplet with OpenFaaS and a private Nexus registry without any of the other changes to reproduce the issue in isolation? Perhaps you can add my ssh key there if you still have issues and we can try to collaborate with you.

Please also email alex@openfaas.com for a Slack invitation for you both.

Alex

@alexellis alexellis changed the title from "Customized Environment - Unable to Find Image During faas Deploy" to "Question: Deploying on Swarm with "Nexus" gives "No such image"" Jul 6, 2018
@RAKedz
Author

RAKedz commented Jul 9, 2018

Hi Alex,

I couldn't attend the DockerCon in San Francisco, I really wanted to go. Glad you met Lyndon.

For this ticket I swapped in placeholder domain names, because the issue is public. I wanted to share the details of what we tweaked in case we left something out or misconfigured something.

Nexus is available as open source. We use Nginx to proxy Nexus to get to the Docker repos. It works perfectly for our current Swarm; I just can't figure out why it isn't working with OpenFaaS.

https://www.sonatype.com/nexus-repository-oss

We are not using Kubernetes. When we started our PoC ~3 years ago there wasn't much with Kubernetes and when I did look into it the documentation was very poor. Docker had better documentation and we started Swarm when it was in beta. Now that Digital Ocean will be having Kubernetes available in July (what I read), I would like to try it out and compare and maybe switch over.

I used the --send-registry-auth as shown in the beginning and I also used the curl. Not sure if I need to use both.

(myregistry.com is not the real name since this is public. It does work.)

docker run -ti myregistry.com:28833/func-nmap:latest sh

latest: Pulling from func-nmap
ff3a5c916c92: Already exists
3a6181dd0caa: Pull complete
d9ec83484db7: Pull complete
a92d4615a34f: Pull complete
59a3e1ea8c6b: Pull complete
80c582fcd57b: Pull complete
70c97e07bc9d: Pull complete
bad954cd7a0c: Pull complete
Digest: sha256:d9ca4382ca7f1255f812824942e8b1c43f75759b52f7e1632ae3da6ad4fec3c0
Status: Downloaded newer image for myregistry.com:28833/func-nmap:latest

I will email you to get the Slack invitation and go from there on how to approach our environment.

Thanks,

Rebecca

@RAKedz
Author

RAKedz commented Jul 9, 2018 via email

@alexellis
Member

Rebecca please set up a minimal example on the 1 or 2gb DigitalOcean droplet with Nexus and share the credentials with us over email. We can then try setting up OpenFaaS to pull from there and debug it.

Alternatively you could create a user in your live or staging nexus.

Alex

@RAKedz
Author

RAKedz commented Jul 9, 2018

Hi Alex or Support,

I am curious on how the command:

faas deploy -f ./stack.yml --network mynetwork --send-registry-auth

is passing the docker user and password to the Swarm in order to pull the image?

When we created our private registry using Nexus we had to do:

docker login myregistry.com:28833

and it saves the values into ~/.docker/config.json

Then we include this when creating the service:

--with-registry-auth

I came across this link https://github.com/openfaas/faas/blob/master/docs/managing-images.md and tried the curl as I demonstrated in my first post but that didn't seem to work, though it didn't complain about the user and auth I passed to it. If it was bad it would complain but I did try that.

Thanks,

Rebecca

@RAKedz RAKedz closed this as completed Jul 9, 2018
@RAKedz RAKedz reopened this Jul 9, 2018
@RAKedz
Author

RAKedz commented Jul 9, 2018

Sorry, hit wrong button. ;<

@RAKedz
Author

RAKedz commented Jul 10, 2018 via email

@RAKedz
Author

RAKedz commented Jul 10, 2018 via email

@RAKedz
Author

RAKedz commented Jul 11, 2018

I upgraded faas from 0.8.2 to 0.8.5 and updated the docker-compose.yml for our environment. Here are the specific changes:

gateway:
    ports:
        - 8180:8080

All the networks are replaced with ours.

    networks:
       - saturn

faas-swarm:
    environment:
        DOCKER_API_VERSION: "1.37"

prometheus:
    ports:
        - 9190:9090

networks:
    saturn:
        labels:
            - "openfaas=true"

I am not sure if I need to do anything with this line. Nginx is our proxy to the Swarm backend. Not sure if I need to be concerned about it.

functions_provider_url: "http://faas-swarm:8080/"

For prometheus.yml

static_configs:
  - targets: ['localhost:9190']

I did not turn on the basic_auth since we use Nginx with basic auth turned on and Let's Encrypt. We do have to login first.

faas-cli login -u user -p password --gateway

In regards to the firewall on the backend where Swarm resides it is set up like this:

Anywhere/esp on eth1 ALLOW Anywhere/esp # Docker Swarm Encryption
2377/tcp on eth1 ALLOW Anywhere # Docker Cluster Management
7946 on eth1 ALLOW Anywhere # Docker Comunication Among Nodes
2375/tcp on eth1 ALLOW Anywhere # Docker daemon remote api un-encrypted
10.132.89.61/esp ALLOW Anywhere # Docker Swarm Encryption IPSec esp 50
2376/tcp on eth1 ALLOW Anywhere # Docker TLS
4789/udp on eth1 ALLOW Anywhere # Docker Swarm Overlay Network Traffic VXLAN
8180/tcp on eth1 ALLOW Anywhere # OpenFaaS Gateway UI
9190/tcp on eth1 ALLOW Anywhere # OpenFaaS Prometheus Metrics

This is just some more details. I still get the same error of the image not found.

-Rebecca

@RAKedz
Author

RAKedz commented Jul 18, 2018

Wanted to give some status on this as I try to work this out.

I decided to install the pass credential helper/manager. It worked but it didn't work with the Swarm nor with OpenFaaS stating it couldn't find the image.

I described it more here:

moby/moby#24940

I also came across this:

#87

and tried using it in the stack.yml within the functions:

"registryAuth": "base64user:password"

It didn't work either.

I am starting to get frustrated, but hoping some light will show up at the end of all this.

Thanks

@alexellis
Member

@RAKedz

> I am starting to get frustrated

This is unfortunate and it's challenging for us to help given the limited information. We do not have access to your environment and don't know how you configure Nexus so cannot test the scenario that you're running into. It sounds like there are more moving parts than we'd typically expect in the configuration.

Did you set up the minimal droplet configuration we asked for? That might be a good next step (suggested 15 days ago)

> I decided to install the pass credential helper/manager. It worked but it didn't work with the Swarm nor with OpenFaaS stating it couldn't find the image.

I am not sure what pass is or why it would help.

> I am curious on how the command would work
> faas deploy -f ./stack.yml --network mynetwork --send-registry-auth

The --send-registry-auth (or -a) flag picks up the auth string from the macOS credential helper, or from the config.json file if no credential store is set up.
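To check which of those two cases applies on a given machine, the Docker client config can be inspected directly. This is a minimal sketch, not what faas-cli does internally: it assumes the standard config.json layout, where "auths" maps a registry address to an entry with a base64 "auth" value and an optional "credsStore" names a credential helper. The function names and sample data are hypothetical.

```python
import json
from pathlib import Path


def load_docker_config(path=None):
    """Read the Docker client config; defaults to ~/.docker/config.json."""
    path = Path(path) if path else Path.home() / ".docker" / "config.json"
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def find_registry_auth(config, registry):
    """Return (auth_string_or_None, uses_credential_store) for a registry."""
    entry = config.get("auths", {}).get(registry, {})
    return entry.get("auth") or None, bool(config.get("credsStore"))


# Sample config for illustration; real values come from load_docker_config().
sample = {"auths": {"myregistry.com:28833": {"auth": "dXNlcjpwYXNz"}}}
auth, uses_store = find_registry_auth(sample, "myregistry.com:28833")
print("auth present:", auth is not None, "| credential store:", uses_store)
```

If the registry key in config.json does not match the registry part of the image name exactly (including the port), the lookup returns nothing, which is one thing worth ruling out.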

This has been tested with the Docker Hub and with GitLab. We recently added a fix for a nested-repo scenario found only in GitLab, I don't know if it's related but the fix was made by @johnmccabe for @tarunmangukiya.

There are two ways I think you could test out the auth with Swarm + OpenFaaS for a private registry

  • --send-registry-auth
  • curl with passing the registryAuth value

When you send this over curl I believe the value should be calculated as:

export USERNAME="alexellis2"
export PASSWORD="secret"
echo -n "$USERNAME:$PASSWORD" | base64
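The same calculation can be sketched in Python, together with a decode step for checking a value that is already in stack.yml or config.json. This assumes, as suggested above, that the value is plain base64 of username:password; the helper names are illustrative.

```python
import base64


def encode_registry_auth(username, password):
    """Base64-encode 'username:password', like echo -n ... | base64."""
    raw = f"{username}:{password}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_registry_auth(value):
    """Reverse the encoding so an existing auth string can be inspected."""
    username, _, password = base64.b64decode(value).decode("utf-8").partition(":")
    return username, password


token = encode_registry_auth("alexellis2", "secret")
print(token)
print(decode_registry_auth(token))
```

Decoding an existing value is a quick way to spot a trailing newline, which is what echo without -n would add.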

You should also try creating a private repo on the Docker Hub, just so you can see the auth working properly.

Alex

@tarunmangukiya
Contributor

@johnmccabe suggested that I try the steps below to check whether the issue is with faas-cli.
The parameter name is registry_auth, not registryAuth.

  1. Add registry_auth to the function in your stack.yml: registry_auth: base64 of username:password
  2. Remove all images of that particular function from the gateway server, so that it has to fetch the latest image from the registry
  3. Check that the respective registry, i.e. your myregistry.com:28833, is in your docker config.json
  4. Deploy using faas-cli deploy -a and wait for some time, as it will pull the full image.

Also, I've seen that over http (non-secure), when faas-cli sends the auth successfully, it shows a warning about using https. This helped me identify that the problem was with faas-cli sending auth (not with OpenFaaS).

Also, @alexellis @johnmccabe, is it possible to show the user an error when OpenFaaS is unable to pull from a private registry? Maybe something like a --verbose mode that shows all the deploy activity?

@johnmccabe
Contributor

johnmccabe commented Jul 22, 2018

You can reproduce this by deploying Nexus with https://github.com/sonatype-nexus-community/docker-nginx-nexus-repository (deploy a hosted repo on port 5000 rather than the proxied one) and replacing localhost with a valid local domain. I run internal DNS, so I have a Nexus deployed in the lab on nexus.lab.johnmccabe.net. If you are running OpenFaaS on a Mac, you'll want to add the nexus.lab.johnmccabe.net:5000 repo to the insecure registries in the Daemon pane of the Docker menu (or whatever the equivalent is for your env).

I can then push/pull with an auth'd docker client.

faas deploy fails similarly to @RAKedz's observations so there looks to be something odd with Nexus here as other registries have no observed issues.

rin66w2se4a6j1vsnikgudihv    \_ gitlabfn.1      nexus.lab.johnmccabe.net:5000/gitlabfn:latest   linuxkit-025000000001   Shutdown            Rejected 7 minutes ago   "No such image: nexus.johnmccabe.net:5000/gitlabfn:latest"

faas push works.

$ faas push
[0] > Pushing gitlabfn.
The push refers to repository [nexus.lab.johnmccabe.net:5000/gitlabfn]
34e362060cdf: Layer already exists
27ba4b53c164: Layer already exists
9138c3bf1eb6: Layer already exists
e64ec8c06ef2: Layer already exists
d600943d4d70: Layer already exists
330717863ab4: Layer already exists
717b092b8c86: Layer already exists
latest: digest: sha256:0510b4200cba7da0eeb82735fa1b881336a5c18e5546444884e1f424431e02e0 size: 1785
[0] < Pushing gitlabfn done.
[0] worker done.

@tarunmangukiya this is a different issue to the one you'd encountered, the creds are getting picked up correctly in this case.

@johnmccabe
Contributor

fwiw the nexus registry is a V2 docker registry.

version - Nexus OSS 3.13.0-01

@RAKedz
Author

RAKedz commented Jul 22, 2018

@tarunmangukiya and @johnmccabe

I emailed @alexellis a username:password to our private registry to test with. I can forward that email on to anyone who wants to reproduce my issue.

The stack.yml now contains this:

provider:
  name: faas
  gateway: https://myopenfaas.com
functions:
  nodeinfo:
    lang: dockerfile
    skip_build: true
    handler: ./func-nmap
    image: myregistry.com:28833/mynodeinfo:latest
    registry_auth:  "bmblahblahblahyippyblahblahJANDU="
    environment:
      read_timeout:   60
      write_timeout:  60
      write_debug: true
    labels:
      com.openfaas.scale.min: 5
      com.openfaas.scale.max: 20
      com.openfaas.factor:  20
    constraints:
      - "node.platform.os == linux"

When I issue the command:

faas deploy -f ./stack.yml --network saturn --send-registry-auth

It only works if the image has already been pulled onto the Swarm nodes. Every other faas command works against the private registry; only deploying to the Swarm with faas deploy fails.

The registry will work directly with any docker command like this:

docker service create --name mynodeinfo --network saturn --mode global --with-registry-auth myregistry.com:28833/mynodeinfo:latest

@alexellis
Member

Hi @RAKedz having looked into this with @johnmccabe we believe that you may need to pass a full path for the function rather than using a repo directly on the registry.

Please can you try the following?

registry.com:port/user/function:tag

This adds the /user/ part which was missing from your example.
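A minimal sketch of checking whether an image name has such a prefix before deploying. The registry detection here is a simplification (the first path component counts as a registry if it contains a dot or colon, or is localhost), digests are not handled, and the function names are hypothetical; it is not how faas-cli itself parses image names.

```python
def split_image_reference(image):
    """Split 'registry:port/path/name:tag' into (registry, path_parts, tag)."""
    first, sep, rest = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        registry, remainder = first, rest
    else:
        registry, remainder = "", image
    name, colon, tag = remainder.rpartition(":")
    if not colon:  # no explicit tag given
        name, tag = remainder, "latest"
    return registry, name.split("/"), tag


def has_namespace_prefix(image):
    """True when the repository path has at least two parts, e.g. user/function."""
    _, path_parts, _ = split_image_reference(image)
    return len(path_parts) >= 2


for ref in (
    "myregistry.com:28833/mynodeinfo:latest",
    "myregistry.com:28833/myuser/mynodeinfo:latest",
):
    print(ref, "->", "has prefix" if has_namespace_prefix(ref) else "no prefix")
```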

@alexellis
Member

I've created a patch, so building the faas-cli from master should also fix the original issue with the way you were using Nexus without a username prefix.

openfaas/faas-cli#489

@RAKedz
Author

RAKedz commented Aug 2, 2018

@alexellis and @johnmccabe I will try out the /user/ I use for the registry and then will try the #489 patch when it gets closed which doesn't require the /user/.

Thanks so much for all your effort. Will keep you posted.

@RAKedz
Author

RAKedz commented Aug 5, 2018

@alexellis I was able to make it work by using the /user/ and this is what I had to do:

Tag the image from myregistry.com:28833/mynodeinfo:latest to myregistry.com:28833/myuser/mynodeinfo:latest and push it back into the registry.
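The retagging step can be sketched as a small helper that derives the new name and prints the matching docker tag and docker push commands. It assumes the image name starts with the registry host, as in this thread; the helper name add_namespace is hypothetical, and the commands are only printed, not executed.

```python
def add_namespace(image, namespace):
    """Insert a namespace after the registry host: host/name -> host/namespace/name."""
    registry, sep, rest = image.partition("/")
    if not sep:
        raise ValueError("expected an image name of the form registry/name[:tag]")
    return f"{registry}/{namespace}/{rest}"


old_image = "myregistry.com:28833/mynodeinfo:latest"
new_image = add_namespace(old_image, "myuser")

# Commands to run by hand after reviewing them.
print(f"docker tag {old_image} {new_image}")
print(f"docker push {new_image}")
```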

Then changed stack.yml to:

provider:
  name: faas
  gateway: https://myopenfaas.com
functions:
  nodeinfo:
    lang: dockerfile
    skip_build: true
    handler: ./func-nmap
    image: myregistry.com:28833/myuser/mynodeinfo:latest
    registry_auth:  "bmblahblahblahyippyblahblahJANDU="
    environment:
      read_timeout:   60
      write_timeout:  60
      write_debug: true
    labels:
      com.openfaas.scale.min: 5
      com.openfaas.scale.max: 20
      com.openfaas.factor:  20
    constraints:
      - "node.platform.os == linux"

Issued the command faas deploy -f ./stack.yml --network saturn --send-registry-auth

Then checked the service docker service ps nodeinfo and saw that all 5 were running on different nodes. I did make sure the image was not on any of the nodes before doing it.

Thanks so much for helping me with this.

@alexellis
Member

So glad we could get to the bottom of this! Thank you for working with us.

The new CLI version will support images without a user prefix, but I'd recommend using one anyway. It could be something like system or payroll etc.

@alexellis
Member

Derek close: resolved

@derek derek bot closed this as completed Aug 7, 2018
@alexellis
Member

@RAKedz please update your CLI version to latest.

Alex

@RAKedz
Author

RAKedz commented Aug 22, 2018

@alexellis I updated the client and tried using an image without the /myuser/ and now I get:

faas deploy -f ./stack.yml --network saturn --send-registry-auth
Deploying: nodeinfo.

Unexpected status: 400, message: Invalid registry auth

Function 'nodeinfo' failed to deploy with status code: 400

I put the /myuser/ back and now it deploys as expected.

@alexellis
Copy link
Member

Ok.. thanks for updating. Please can you just use a prefix all the time, it doesn't have to be the user - it can also be for instance "payroll" as a kind of namespace? Alex

@RAKedz
Author

RAKedz commented Aug 22, 2018

@alexellis Does that mean 'payroll' can be used even though it may not be an actual user in the registry, i.e. it's just a placeholder?

@RAKedz
Author

RAKedz commented Aug 22, 2018

I also had to use our existing network and I had to change the openfaas docker-compose.yml as described here:

Use a pre-existing network
If you want your containers to join a pre-existing network, use the external option:

networks:
  default:
    external:
      name: my-pre-existing-network

After doing those two things I got the function to respond. Now that I know the platform is working, Lyndon can start creating functions for the application.
