
Redirect when connecting to MinIO #21

Closed
datawookie opened this issue Dec 1, 2021 · 18 comments


@datawookie
Contributor

Hi @espebra,

I'm pushing ahead with getting the local Docker environment up and running and documenting the process so that you can include it in the README.

I'd like to be able to access MinIO on localhost:9000 as indicated in the current README. However, when I browse to that URL I immediately get redirected to another port (which is not exposed).

[screenshot: redirect to another port]

I don't know anything about MinIO, so I'm afraid that I can't fix this.

Thanks, Andrew.

@datawookie
Contributor Author

datawookie commented Dec 1, 2021

Hi @espebra,

Following on from the above (I suspect that this might be related). I've set up the following environment variables:

S3_ENDPOINT="localhost:9000"
S3_REGION="eu-west-1"
S3_BUCKET="my-secret-filebin"
S3_ACCESS_KEY="xxx"
S3_SECRET_KEY="xxx"

Now when I run Docker Compose I get the following error:

[screenshot of error]

The MinIO service is up and running at this point, so I suspect that the problem here lies more with the communication to S3.

Thanks for your help with this. It seems that getting this up and running is not a trivial process, and I'm very happy to add the details to the documentation (once I have worked them out myself).

Since I'm sure that this is just a configuration issue on my side, would you be open to a quick call to iron this out? Then I can write up the documentation and make a PR.

Thanks, Andrew.

@Nenuial

Nenuial commented Dec 5, 2021

I can confirm that it was not trivial to get it running on my own Docker instance.
I used my server's IP for the DATABASE_HOST and S3_ENDPOINT environment variables. This solved the issues when running docker compose.

@datawookie
Contributor Author

Thanks, @Nenuial. Okay, so it's good to know that I'm not the only person battling with this. 👍

Would you mind sharing a suitably obfuscated copy of your environment variables? I've already burned a number of hours on this and I'd really like to get this working and documented so that I can move on.

@Nenuial

Nenuial commented Dec 5, 2021

Here's the docker-compose.yml that worked for me except for the #22 issue:
https://gist.github.com/Nenuial/5c20cff2d3b4334e72b7f5ed0f1ead10

@espebra
Owner

espebra commented Dec 5, 2021

> I'd like to be able to access MinIO on localhost:9000 as indicated in the current README. However, when I browse to that URL I immediately get redirected to another port (which is not exposed).

Minio now starts the console on a random high port, so I specified port 9001 in the compose file in 7366738. Does this solve the problem?
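For reference, pinning the console port looks roughly like this in a compose file (the service name `s3` is taken from the logs later in this thread; the exact layout of the repository's compose file may differ):

```yaml
# Sketch: pin the MinIO console to 9001 and publish both the
# S3 API port and the console port to the host.
services:
  s3:
    image: minio/minio
    command: server /data --console-address ":9001"
    ports:
      - "9000:9000"   # S3 API
      - "9001:9001"   # web console
```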

@espebra
Owner

espebra commented Dec 7, 2021

> Hi @espebra,
>
> Following on from the above (I suspect that this might be related). I've set up the following environment variables:
>
> S3_ENDPOINT="localhost:9000"
> S3_REGION="eu-west-1"
> S3_BUCKET="my-secret-filebin"
> S3_ACCESS_KEY="xxx"
> S3_SECRET_KEY="xxx"
>
> Now when I run Docker Compose I get the following error:
>
> [screenshot of error]
>
> The MinIO service is up and running at this point, so I suspect that the problem here lies more with the communication to S3.

There is a docker-compose.yml file in the repository already that spins up an environment that works (should work), and it includes MinIO and Filebin as two separate containers with environment variables required for communication to work. Can you use this as a basis when you create your own?

@datawookie
Contributor Author

Hi @espebra and @Nenuial, I've made some progress with this but have run into a roadblock. I've made some changes and pushed them to a forked copy of the repository at https://github.com/datawookie/filebin2.

Specifically this is what I have done (all updates in docker-compose.yml):

  • MinIO is now running as an S3 gateway (this means that the files are actually being uploaded to my S3 bucket!) and
  • environment variables are now being picked up from the environment (also reduced the number of variables that you need to define by using the S3 credentials in multiple places).

You'd launch Docker Compose as before but need to set up your environment variables first. I created an .env file for this purpose with the following contents:

S3_ENDPOINT=storage:9000
S3_REGION=us-east-1
S3_BUCKET=datawookie-filebin
S3_ACCESS_KEY=AKIA2BRYTBPZMK3HGBQK
S3_SECRET_KEY=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

Don't worry, I've obfuscated the S3 credentials.
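For anyone following along, Docker Compose substitutes `${VAR}` references from a `.env` file in the same directory, so consuming the variables above looks roughly like this (the service name `app` matches the logs later in this thread; the full file has more settings):

```yaml
# Sketch: pass the .env values through to the filebin container.
services:
  app:
    environment:
      S3_ENDPOINT: ${S3_ENDPOINT}
      S3_REGION: ${S3_REGION}
      S3_BUCKET: ${S3_BUCKET}
      S3_ACCESS_KEY: ${S3_ACCESS_KEY}
      S3_SECRET_KEY: ${S3_SECRET_KEY}
```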

When I upload a file it gets transferred to S3 (I can see this in my S3 console on AWS as well as in MinIO on localhost:9000). So that part works very nicely. :)

I can also download an archive of the files in a bucket! 👍

But when I try to download individual files it breaks. Could you please take a look? I feel like we are close to making the setup a lot simpler. Just one minor (?) technical hurdle to clear.

Thanks, Andrew.

@datawookie
Contributor Author

Hi @espebra @Nenuial, any feedback on ☝️? I'd really like to either sort this out in the next couple of days or I'm going to have to move onto other things. I really feel that having this Docker setup well documented would be a very useful addition to the project. However, I've got a lot of other things on my plate and I can't afford to spend a lot more time on this. So your feedback would be super helpful! Thanks, Andrew.

@espebra
Owner

espebra commented Dec 11, 2021

When you click download on an individual file, you'll get a redirect back from filebin. The location header in this redirect points to a valid URL based on the S3_ENDPOINT value. The URL is generated using the PresignedGetObject function call.

Which kind of URL do you get when you download an individual file and what is the value of the S3_ENDPOINT environment variable?
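To illustrate what presigning means here: filebin itself calls minio-go's PresignedGetObject, but the mechanism can be sketched with the standard library alone. This is a simplified sketch of an AWS Signature Version 4 query-string presigned GET URL, not filebin's actual code; the endpoint, bucket, and credentials below are placeholders.

```python
# Sketch of building an S3 presigned GET URL (SigV4, query-string style).
# Note that the host is part of the signed content, which is why the
# client must reach the object store at exactly the S3_ENDPOINT value.
import hashlib
import hmac
import urllib.parse
from datetime import datetime, timezone

def presign_get(endpoint, bucket, key, access_key, secret_key,
                region="us-east-1", expires=3600):
    host = endpoint
    now = datetime.now(timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    datestamp = now.strftime("%Y%m%d")
    scope = f"{datestamp}/{region}/s3/aws4_request"
    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    canonical_query = urllib.parse.urlencode(sorted(params.items()))
    # Canonical request: method, path, query, signed headers, payload hash.
    canonical_request = "\n".join([
        "GET", f"/{bucket}/{key}", canonical_query,
        f"host:{host}\n", "host", "UNSIGNED-PAYLOAD",
    ])
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode()).hexdigest(),
    ])
    def hmac_sha256(k, msg):
        return hmac.new(k, msg.encode(), hashlib.sha256).digest()
    signing_key = hmac_sha256(
        hmac_sha256(hmac_sha256(hmac_sha256(
            ("AWS4" + secret_key).encode(), datestamp), region), "s3"),
        "aws4_request")
    signature = hmac.new(signing_key, string_to_sign.encode(),
                         hashlib.sha256).hexdigest()
    return (f"http://{host}/{bucket}/{key}?{canonical_query}"
            f"&X-Amz-Signature={signature}")

url = presign_get("storage:9000", "my-bucket", "some/object",
                  "AKIAEXAMPLE", "secretexample")
print(url)
```

The resulting URL points directly at the object store, which is why the browser, not just the filebin container, must be able to resolve and reach `S3_ENDPOINT`.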

@espebra
Owner

espebra commented Dec 11, 2021

My hypothesis is that the S3_ENDPOINT value you use can be resolved by the server running filebin but is not reachable/resolvable for the client. Given that filebin is using presigned URLs for clients to download files, the clients must be able to connect directly to the object storage system. If you run MinIO as the object storage, then you'll need to put the server's public hostname as the S3_ENDPOINT and not localhost to allow clients to connect to it.

If you want to see how it works on filebin.net; the S3_ENDPOINT is set to situla.bitbit.net, which the clients connect directly to for file downloads.
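For a purely local compose setup, one possible workaround (my suggestion, not something confirmed in this thread) is to give the host machine the same hostname mapping that the app container gets from the Docker network. Assuming the MinIO service is called `storage` and publishes port 9000:

```text
# /etc/hosts on the host machine (illustrative)
127.0.0.1   storage
```

With this entry, both the filebin container and the browser reach MinIO as http://storage:9000, so the presigned URL's hostname resolves everywhere and its signature stays valid.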

@datawookie
Contributor Author

datawookie commented Dec 12, 2021

Hi @espebra,

Okay, thanks. I think I am finally getting somewhere.

Local Setup

Using the same setup as before.

When I try to download a file directly from the Filebin console (locally hosted) the curl command looks like this (after trimming off the parameters):

curl 'http://storage:9000/datawookie-filebin/\
cef631c791d58ecba9f0ebf43118c56566e46f4a616119b67f3e6a0a49c8e4cc/\
d937494413acdc9c1fc0c60ddbea5d2e8df91b250086ab21945b468288be0fd3'

I can see the values of S3_ENDPOINT and S3_BUCKET I specified in my environment. This fails with "Server not found".

@espebra is this the "resigned" URL you were referring to?

This URL doesn't work because my browser cannot resolve storage into an IP address (it only has meaning within the Docker network). If I simply replace storage with 127.0.0.1 then I get back some XML indicating that there's a problem with a signature. This feels like a step in the right direction.
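The signature error after rewriting the hostname is expected: a SigV4 signature covers the Host, so editing the hostname by hand invalidates the URL. A toy sketch (deliberately simplified, not the real canonical request, with a placeholder secret):

```python
# Sketch: the signed content includes the host, so the same object
# presigned against two different hosts yields different signatures.
import hashlib
import hmac

def sign(host, path, secret="examplesecret"):
    # Grossly simplified canonical request; real SigV4 also covers the
    # date, credential scope, and query string, but the host appears
    # in the same way.
    canonical = f"GET\n{path}\nhost:{host}"
    return hmac.new(secret.encode(), canonical.encode(),
                    hashlib.sha256).hexdigest()

sig_internal = sign("storage:9000", "/datawookie-filebin/obj")
sig_rewritten = sign("127.0.0.1:9000", "/datawookie-filebin/obj")
print(sig_internal == sig_rewritten)  # False: the rewritten URL is rejected
```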

[screenshot of XML signature error]

If I go to the MinIO console (on 127.0.0.1:9000) and log in using the S3 credentials then I can navigate to the specified bucket, find the file and download it. The download is fine, and this is the equivalent curl command (after trimming off the parameters).

curl 'https://datawookie-filebin.s3.eu-west-1.amazonaws.com/\
cef631c791d58ecba9f0ebf43118c56566e46f4a616119b67f3e6a0a49c8e4cc/\
d937494413acdc9c1fc0c60ddbea5d2e8df91b250086ab21945b468288be0fd3'

This looks similar to the previous command, but has a different host (S3 versus local) and the latter doesn't have the

Variations

If I start with S3_ENDPOINT=storage:9000 then this is what I see in the logs:

s3_1   | API: http://172.20.0.3:9000  http://127.0.0.1:9000 
s3_1   | Console: http://172.20.0.3:9001 http://127.0.0.1:9001 
app_1  | Established session to S3AO at storage:9000
app_1  | Found S3AO bucket: datawookie-filebin

So the application in this case appears to have access to the bucket. 👍

If I start with S3_ENDPOINT=127.0.0.1:9000 then this is what I see in the logs:

s3_1   | API: http://172.20.0.3:9000  http://127.0.0.1:9000 
s3_1   | Console: http://172.20.0.3:9001 http://127.0.0.1:9001 
app_1  | Established session to S3AO at 127.0.0.1:9000
app_1  | Unable to check if S3AO bucket exists: Get "http://127.0.0.1:9000/datawookie-filebin/?location=":
  dial tcp 127.0.0.1:9000: connect: connection refused
app_1  | Unable to initialize S3 connection: Get "http://127.0.0.1:9000/datawookie-filebin/?location=":
  dial tcp 127.0.0.1:9000: connect: connection refused

If I start with S3_ENDPOINT=92.16.17.118:9000 (using my machine's external IP address) then this is what I see in the logs:

s3_1   | API: http://172.20.0.2:9000  http://127.0.0.1:9000 
s3_1   | Console: http://172.20.0.2:9001 http://127.0.0.1:9001 
app_1  | Established session to S3AO at 92.16.17.118:9000
app_1  | Unable to check if S3AO bucket exists: Get "http://92.16.17.118:9000/datawookie-filebin/?location=":
  dial tcp 92.16.17.118:9000: connect: connection refused
app_1  | Unable to initialize S3 connection: Get "http://92.16.17.118:9000/datawookie-filebin/?location=":
  dial tcp 92.16.17.118:9000: connect: connection refused

In neither of these cases does the application see the bucket. 👎 Why? Well, using 127.0.0.1 is not going to work because 127.0.0.1 means something different within the container: it's not referencing the actual host but the container itself.

Investigating https://filebin.net/

Looking at the same information from https://filebin.net/, this is what I see when downloading:

curl 'https://situla.bitbit.net/filebin/\
01b5383630f3cf020117be853ef1b4622d7f3314ee55687ef03265cddb84050c/\
9d4d4b776754e02ee93341c06182e392facf0b49c993374f4e27d06a76137602'

From this I infer that you have the following settings:

S3_ENDPOINT=situla.bitbit.net
S3_BUCKET=filebin

This agrees with what you mentioned ☝️.

Conclusion

Why is this so hard? I feel like there is some vital piece of information that I don't have.

A question:

  • Are you running Minio as server or gateway? The original docker-compose.yml uses server but in this case it doesn't seem to actually connect to S3. Or am I missing something? At least with gateway I am getting files actually uploaded to my bucket.

Sorry, I know that this is a lot of information. I've spent a lot of time trying to figure this all out. I'd really appreciate some more help resolving this. I really feel that people are not going to host Filebin themselves if it's this hard to get it up and running. This would be a shame because Filebin is amazing! Please help me to get this up and running so that I can document the process and make it easy for others to do the same.

Thanks, Andrew.

@espebra
Owner

espebra commented Dec 12, 2021

> When I try to download a file directly from the Filebin console (locally hosted) the curl command looks like this (after trimming off the parameters):
>
> curl 'http://storage:9000/datawookie-filebin/\
> cef631c791d58ecba9f0ebf43118c56566e46f4a616119b67f3e6a0a49c8e4cc/\
> d937494413acdc9c1fc0c60ddbea5d2e8df91b250086ab21945b468288be0fd3'
>
> I can see the values of S3_ENDPOINT and S3_BUCKET I specified in my environment. This fails with "Server not found".
>
> @espebra is this the "resigned" URL you were referring to?

Yes, correct. The URL is not resigned, it is presigned - which is a concept in S3 that allows clients to download objects directly from S3 rather than going through a web app (like filebin).

> Looking at the same information from https://filebin.net/, this is what I see when downloading:
>
> curl 'https://situla.bitbit.net/filebin/\
> 01b5383630f3cf020117be853ef1b4622d7f3314ee55687ef03265cddb84050c/\
> 9d4d4b776754e02ee93341c06182e392facf0b49c993374f4e27d06a76137602'
>
> From this I infer that you have the following settings:
>
> S3_ENDPOINT=situla.bitbit.net
> S3_BUCKET=filebin
>
> This agrees with what you mentioned ☝️.

Correct.

> Conclusion
>
> Why is this so hard? I feel like there is some vital piece of information that I don't have.

It's a bit rough around the edges and the documentation is lacking a bit, but it seems like you've gotten it to work with AWS S3? As for a local setup I'm not quite sure how to get individual file downloads to work in a local docker-compose with MinIO. It would require the host to have the same mapping of the hostname to the minio container as the app container.

> A question:
>
> * Are you running Minio as `server` or `gateway`? The original `docker-compose.yml` uses `server` but in this case it doesn't seem to actually connect to S3. Or am I missing something? At least with `gateway` I am getting files actually uploaded to my bucket.

I'm running Minio as a server in the development environment and with the unit tests. It is a server that provides S3 capabilities locally. On filebin.net I'm using Ceph.

> Sorry, I know that this is a lot of information. I've spent a lot of time trying to figure this all out. I'd really appreciate some more help resolving this. I really feel that people are not going to host Filebin themselves if it's this hard to get it up and running. This would be a shame because Filebin is amazing! Please help me to get this up and running so that I can document the process and make it easy for others to do the same.

What is missing now? Getting individual file downloads to work in the development environment in docker-compose.yml?

@Nenuial

Nenuial commented Dec 12, 2021

Hi,
I'm sorry I haven't dropped in earlier. The local setup I use seems to work just fine for what I need. Individual file downloads also work just fine.

I haven't tried the new docker-compose.yml setup with the .env file but I can if it helps.

@datawookie
Contributor Author

Hi @Nenuial, thanks for letting me know.

> The local setup I use seems to work just fine for what I need.

Is this setup pushing files to S3 or are they being stored and retrieved locally?

@datawookie
Contributor Author

datawookie commented Dec 16, 2021

> It is a server that provides S3 capabilities locally. On filebin.net I'm using Ceph.

Aha! Okay, so you are using MinIO locally to mock an S3 session?

Just to be clear on what I am aiming to achieve: I'd like to document the setup for the following three configurations:

  1. Local testing using a local file store.
  2. Local testing using S3 for file storage.
  3. Deployment using S3 for file storage.

I'd like to provide Docker-based setup for each of these configurations because I think that this is the quickest way that people will be able to just spin it up and use it without having to worry about the intricacies of getting the processes to talk to each other.

I think that (1) is accomplished using the docker-compose.yml as it stands in the repository (now that I understand the purpose and that it's mocking an S3 backend).

I think that I'm close to (2) but just need to figure out how to get individual file downloads to work. Any help with this would be appreciated. @Nenuial if you've cracked this then I'd really appreciate having a chat about how it was done so that we can make it as smooth as possible for new users (this would involve creating a separate docker-compose.yml, I presume, and fleshing out the documentation).

I have not even embarked on (3). Helpful to know that you are using Ceph for this, @espebra. I'll ping you about details once I get to this stage.

Thanks, Andrew.

PS. @espebra I've written an R wrapper for Filebin which has got a lot of interest and which we are using for a few projects. I think that there is the potential for driving a lot of interested users to this project. We just need to lower the barrier to entry a bit!

@Nenuial

Nenuial commented Dec 16, 2021

Hi,

> Is this setup pushing files to S3 or are they being stored and retrieved locally?

The files are stored locally.

@espebra
Owner

espebra commented Dec 19, 2021

> It is a server that provides S3 capabilities locally. On filebin.net I'm using Ceph.
>
> Aha! Okay, so you are using MinIO locally to mock an S3 session?

Yes, correct. I'm using MinIO in the dev environment and for unit tests. File download from MinIO worked fine earlier, but apparently broke when I moved to presigned URLs for downloading directly from S3 instead of proxying file downloads through filebin.

> Just to be clear on what I am aiming to achieve: I'd like to document the setup for the following three configurations:
>
> 1. Local testing using a local file store.
> 2. Local testing using S3 for file storage.
> 3. Deployment using S3 for file storage.
>
> I'd like to provide Docker-based setup for each of these configurations because I think that this is the quickest way that people will be able to just spin it up and use it without having to worry about the intricacies of getting the processes to talk to each other.

Makes sense.

> I think that (1) is accomplished using the docker-compose.yml as it stands in the repository (now that I understand the purpose and that it's mocking an S3 backend).

Well, I'm not sure if mocking is the right word. MinIO is an actual S3 compatible object storage service that can run locally. It is not AWS S3 obviously, but most of the AWS S3 APIs work fine with MinIO.

> I think that I'm close to (2) but just need to figure out how to get individual file downloads to work. Any help with this would be appreciated. @Nenuial if you've cracked this then I'd really appreciate having a chat about how it was done so that we can make it as smooth as possible for new users (this would involve creating a separate docker-compose.yml, I presume, and fleshing out the documentation).

When you say Local testing using S3 for file storage, do you mean AWS S3 or local S3 via MinIO?

> I have not even embarked on (3). Helpful to know that you are using Ceph for this, @espebra. I'll ping you about details once I get to this stage.

Sure.

> PS. @espebra I've written an R wrapper for Filebin which has got a lot of interest and which we are using for a few projects. I think that there is the potential for driving a lot of interested users to this project. We just need to lower the barrier to entry a bit!

Cool! I'm not using R myself, but API wrappers are nice.

@espebra
Owner

espebra commented Dec 4, 2022

This issue has been inactive for a year now, so I'll go ahead and close it. Feel free to reopen if you'd like to continue.

@espebra espebra closed this as completed Dec 4, 2022