-
Notifications
You must be signed in to change notification settings - Fork 235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Cloudflare R2 file storage as an optional feature #465
Conversation
I’ll leave comments inline on the PR, but just to comment on the overall direction. I think it’s absolutely right to implement this as a configurable option via feature flag, but keep the default to be postgresql. For anyone running the app just for themselves, postgres is simpler setup. This S3-like option is really just for larger deployments. |
docker-compose.yml
Outdated
@@ -10,6 +10,8 @@ services: | |||
POSTGRES_PASSWORD: secret | |||
volumes: | |||
- hostedgpt_pgdata:/var/lib/postgresql/data | |||
ports: | |||
- 5433:5432 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this do? The docker compose has been working for me without this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not necessary, but useful if you want to connect to the postgres instance that's running in docker from your local machine via tools like postico. Happy to remove it if you want. I was just using it to look at my database
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, that makes sense. I think I used to have a couple ports exposed and then someone submitted a PR to remove them saying: it’s exposing ports which are conflicting with my local services and those ports arent being used.
So i think it helps ppl for us to expose less
config/storage.yml
Outdated
access_key_id: <%= ENV["CLOUDFLARE_ACCESS_KEY_ID"] || Rails.application.credentials.dig(:cloudflare, :access_key_id) %> | ||
secret_access_key: <%= ENV["CLOUDFLARE_SECRET_ACCESS_KEY"] || Rails.application.credentials.dig(:cloudflare, :secret_access_key) %> | ||
region: auto | ||
bucket: <%= ENV["CLOUDFLARE_BUCKET"] || Rails.application.credentials.dig(:cloudflare, :bucket) %> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of these should be Setting.
and then the “ENV || *.credential” happens inside options.yml
config/environments/test.rb
Outdated
@@ -36,9 +36,10 @@ | |||
# Disable request forgery protection in test environment. | |||
config.action_controller.allow_forgery_protection = false | |||
|
|||
# Store uploaded files on the local file system in a temporary directory. | |||
# Store uploaded files in the db for testing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should just delete this comment, especially when it’s already out of sync with storage.yml. :) People can just check storage.yml if they want to know how test is storing files.
I’m definitely in the camp of “comment very little unless it’s really needed to clarify because inevitably the comments will get out of sync from the code.” This is one of those comments that was generated by rails with the original “rails new” and I haven’t gone through the codebase and removed those generated comments which have become irrelevant.
But actually, aside from this comment, don’t you want to remove active_storage.service from this file entirely since it’s now within application.rb?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah agreed with the comments. If the code explains itself without too much trouble then no need for further explanation with comments.
For this file I purposely left it here thinking that we should override anything that would be set in the application.rb config file here since I don't think we want to upload anything to cloud storage during tests. At least that's my initial thought. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, we don't want to upload to cloud storage during tests. But since the default is postgres, we shouldn't need to override this for test and instead can use the postgres default in the test environment. I think it's a good thing that tests will use the postgres adapter because if we ever inadvertently break that adapter we'll notice failing tests.
config/environments/test.rb
Outdated
config.active_storage.service = :test | ||
config.active_storage.variant_processor = :mini_magick # vips is better but does not work on Mac Apple Silicon so using this in dev & test | ||
# vips is better but does not work on Mac Apple Silicon so using this in dev & test | ||
config.active_storage.variant_processor = :mini_magick |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it’s worth reverting this. Even though it makes the line really long, which is a bit annoying, the few times I do add comments I try to bias towards having them on the actual line they’re explaining rather than their own line, only because it slightly decreases the frequency that someone updates a line without reading the comment.
I feel like I stated that in a confusing way. All I’m trying to say is: if the comment is on its own line, it’s more likely to get out of sync. So let’s keep it connected to the mini_magick thing even though it makes the line really long.
config/application.rb
Outdated
if ENV["CLOUDFLARE_STORAGE_FEATURE"] == "true" | ||
config.active_storage.service = :cloudflare | ||
else | ||
config.active_storage.service = :local |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should rename “local” to “database”, for clarity. It looks like I changed the backend for “local” within storage.yml, since it was previously referring to local file system, but failed to give it a more logical name. oops. :)
@@ -44,6 +44,7 @@ gem "tiktoken_ruby", "~> 0.0.6" | |||
gem "solid_queue", "~> 0.2.1" | |||
gem "name_of_person" | |||
gem "actioncable-enhanced-postgresql-adapter" # longer paylaods w/ postgresql actioncable | |||
gem "aws-sdk-s3", require: false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m not familiar with require: false, I just looked it up and it sounds like when this is false the you need to explicitly set “require …” in the file that uses it. Is that the intent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh interesting. Maybe rails requires it explicitly internally. Just going based on the guides here: https://guides.rubyonrails.org/active_storage_overview.html#s3-service-amazon-s3-and-s3-compatible-apis
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's interesting, maybe keep it then until you run into problems and then you'll either solve the problem by doing an explicit "require ..." or removing this flag.
Gemfile options is one thing I always mean to go deep on and truly understand, but I've still not done it yet. So I get confused myself on the various version specifications (or lack there of) and other options. Now that I know about "require false" I could probably switch a few gems to use that since there are some I only use in a single file within the app.
config/application.rb
Outdated
@@ -33,6 +33,13 @@ class Application < Rails::Application | |||
config.time_zone = "Central Time (US & Canada)" | |||
config.eager_load_paths << Rails.root.join("lib") | |||
|
|||
# Active Storage | |||
if ENV["CLOUDFLARE_STORAGE_FEATURE"] == "true" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't use Feature.
here because application hasn't finished starting yet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohhh, shoot. Hmm. I think then still add it to Feature and options.yml, just in case other parts of the app need to check it, and then here just repeat the “ENV || application…” which you put in the options.yml
Co-authored-by: Keith Schacht <krschacht@gmail.com>
config/storage.yml
Outdated
@@ -1,9 +1,27 @@ | |||
# TODO: remove test, local, and local_public options from this file | |||
# figure out how to migrate existing records to handle that change |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ran into some issues with existing database records when I changed local
to database
so I left the existing test
, local
, and local_public
here for now and added this TODO. I don't plan to leave this here, but will come back tomorrow and finish. I can either revert the name changes or do a database migration to try to update existing data with the new names. I would definitely want someone with more test data to try the migration if that's the route we take though since I have almost nothing in my db at this point
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem. And for trying things out in development, it really helps to load the fixture data into your dev database. I used to have the seeds file configured to do this but it caused some issues in production. This is an old commit back when it was still working:
https://github.com/AllYourBot/hostedgpt/blob/f66ee3b8d2bac67acfe7f8a174f29930d7176af6/db/seeds.rb
I think this should work as is, although it may be missing a few more recent database tables so check schema.rb and maybe add the missing items to the array. Note: the array is ordered so tables referenced by other tables are loaded first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, after you update this file, you run bin/rails db:seed
to load it. If that doesn't work, you maybe have to run bin/rails db:drop; bin/rails db:prepare
. I vaguely remember having some issues with referential integrity.
So, if that still doesn't work, then maybe copy and paste the referential integrity hack from this version of the seeds file:
https://github.com/AllYourBot/hostedgpt/blob/3de36588ca922a22edd232c70fe27967576e78ad/db/seeds.rb
I really probably should have preserved this "load fixtures" trick but moved it into a rake task. If you get this working, maybe when you're done with this PR we tee up another PR that turns this into a clean rake task. The only reason I removed this is because it was causing some issues with the various production hosting environments due to stricter postgres permissions, but this is quite helpful in development.
I think this is all good, but I'm working on running the seeds to verify. I was able to create some files on the main branch and then checkout this branch, see it breaking, and then migrate to see it working again. If we're ok with that then feel free to merge, but otherwise I'll try to have the seeds working before I sleep tonight or tomorrow |
Hi Justin, I had a busy day today and didn’t get a chance to take a look,
but I should be able to tomorrow AM before you start (I’m still in Europe).
…On Thu, Jul 18, 2024 at 2:51 PM Justin Vallelonga ***@***.***> wrote:
I think this is all good, but I'm working on running the seeds to verify.
I was able to create some files on the main branch and then checkout this
branch, see it breaking, and then migrate to see it working again. If we're
ok with that then feel free to merge, but otherwise I'll try to have the
seeds working before I sleep tonight or tomorrow
—
Reply to this email directly, view it on GitHub
<#465 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAIR5JNY72U3T5WWUNXF4TZM7B4LAVCNFSM6AAAAABLAROCCCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEMZWGU4DKMRXGM>
.
You are receiving this because you commented.Message ID:
***@***.***>
|
@jlvallelonga I gave this one more read through before merging in. I didn't love the special-casing of this one feature being ENV only (and needing to explain that) so I figured out how to make While I was in there, I did a close read-through of all the optional features we cover within the README and I did copy editing of that to try and streamline it a bit. While doing this, I decided to remove any mention of the rails credentials file. We'll still support it, but I'll consider that an advanced thing for people who already know how rails credentials work. For everyone else, it's easiest and more standard to simply set ENV. |
nice! definitely more clean that way |
First week, first PR merged in 🍾 🎉 It’s always nice to get something full cycle. Now we’re off! |
Adds Cloudflare R2 as option for storing files with Active Storage.
Task from here: https://github.com/AllYourBot/hostedgpt/wiki/Draft-project-plan
Starting this as an optional change, but happy to modify this to make it a requirement or change it to be the default if we want.