[Discussion] Ghost File Storage a.k.a Ghost on Heroku #4600

Closed
ErisDS opened this issue Dec 7, 2014 · 9 comments

@ErisDS
Member

ErisDS commented Dec 7, 2014

There is a very (very) old issue called 'More Filestorage Abstraction' #2852, that details some changes we need to make to Ghost to make the file storage layer more easily extensible. #2852 has languished for a while, but is very important, so this discussion is an attempt to come up with a better spec for what we want to do, break it down into smaller issues and hopefully get the work completed.


Background on the types of file Ghost stores:

Currently, there are two types of files written by Ghost: images and data exports (backups). Images are properly handled via the storage class; exports are not. In the near future we also want to add API endpoints and UIs to allow users to upload themes and apps, and to write those files through Ghost as well.


The problem:

Many PaaS-style services, like Heroku, don't have local persistent storage. They tend to completely overwrite the local install whenever a new version of the app is deployed, so the database and any files that change need to live elsewhere. Ghost happily supports configuring an external database via config.js, and we need to do the same or similar for file storage.

We want to make it possible to EASILY extend Ghost to use a 3rd party storage option, e.g. storing images on AWS so that it's easy to install Ghost on Heroku or another PaaS. There's actually a really nice write-up on how to do this which includes a storage class for AWS, but you still need to hack Ghost to use it.

This should be possible without any need to hack Ghost core.

The solution:

Ghost has the concept of a storage class, so the hard part is essentially done. What we need now is 1) a standardised place to put 3rd party storage classes and 2) a standardised way to configure Ghost to use a different storage class and provide necessary configuration. Additionally, we'd need a way to validate a storage class and check that it works.
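To make the validation idea concrete, here's a rough sketch of what such a check could look like. The list of required methods (`save`, `exists`, `serve`) and the helper names `missingStorageMethods` and `validateStorage` are assumptions for illustration, not something Ghost defines in this discussion.

```javascript
// Hypothetical list of methods a storage class must implement (an assumption).
var REQUIRED_METHODS = ['save', 'exists', 'serve'];

// Returns the names of required methods missing from a storage instance.
function missingStorageMethods(storage) {
    return REQUIRED_METHODS.filter(function (name) {
        return !storage || typeof storage[name] !== 'function';
    });
}

// Throws a descriptive error if the storage instance is incomplete.
function validateStorage(name, storage) {
    var missing = missingStorageMethods(storage);
    if (missing.length > 0) {
        throw new Error('Storage class "' + name + '" is missing: ' + missing.join(', '));
    }
    return storage;
}

// Usage: an incomplete class is reported before Ghost would try to use it.
var incomplete = {save: function () {}, exists: function () {}};
console.log(missingStorageMethods(incomplete)); // [ 'serve' ]
```

A check like this only proves the methods exist; confirming that the class actually works would still need a real write and read against the backend.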

My proposal is as follows:

  1. The standard place to put a storage class should be /content/storage/
  2. The standard way to configure Ghost to use a storage class should be via a storage configuration object.

Using /content/storage/ creates a clear place in user-content-land where users can provide a class to override the default. This is better than just putting a file in core/server/storage because it keeps the upgrade process far easier. Further, if Ghost detects a file in content/storage it could potentially use that to override the default without requiring configuration.
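As a rough sketch of the "detect and override" idea, the function below picks a storage class from the file names found in content/storage and falls back to a default. The function name `pickStorageName` and the default name passed in are placeholders for illustration; in a real install the file list would come from something like `fs.readdirSync`.

```javascript
// Picks the storage class to use from the file names found in content/storage.
// Falls back to the given default when the directory holds no .js files.
function pickStorageName(fileNames, defaultName) {
    var candidates = fileNames.filter(function (file) {
        return /\.js$/.test(file);
    }).map(function (file) {
        return file.replace(/\.js$/, '');
    });

    if (candidates.length > 1) {
        // Ambiguous: explicit configuration would be needed to choose.
        throw new Error('Multiple storage classes found: ' + candidates.join(', '));
    }
    return candidates.length === 1 ? candidates[0] : defaultName;
}

console.log(pickStorageName(['s3.js', 'README.md'], 'local-file-store')); // s3
console.log(pickStorageName([], 'local-file-store')); // local-file-store
```

Throwing when more than one class is present is just one possible choice; it shows why an explicit config flag may still be needed.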

At the moment we have a fileStorage config flag that can turn storage off so that users aren't prompted to upload files if they're using a PaaS environment. I don't really think anyone is using this though! storage is also a nicer name for the config object.

The new storage object could be used to provide all manner of details to both Ghost and the storage class. I can imagine that in future we might want to provide configuration such that images are stored differently to themes & apps. This information would be needed by Ghost, and each storage class would need its individual configuration.

I think that the configuration object inside of storage should be keyed on the name of the storage class, so if you have /content/storage/s3.js, your config object would be:

storage: {
   /* core config options here */
   s3: {
     /* s3 config here */
   }
}

This line of storage/index.js could then be changed to always pass the object its own config.
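A minimal sketch of that change, assuming the storage class is a constructor that accepts a config object. `getStorage`, `loadClass` and `FakeS3` are hypothetical names; `loadClass` stands in for require()-ing the file from /content/storage/.

```javascript
// Builds a storage instance, handing the class only the config keyed on its name.
// `loadClass` is a stand-in for require()-ing /content/storage/<name>.js.
function getStorage(name, storageConfig, loadClass) {
    var StorageClass = loadClass(name),
        ownConfig = (storageConfig && storageConfig[name]) || {};
    return new StorageClass(ownConfig);
}

// Usage with a fake class standing in for /content/storage/s3.js.
function FakeS3(config) {
    this.bucket = config.bucket;
}

var config = {storage: {s3: {bucket: 'my-blog-images'}}};
var storage = getStorage('s3', config.storage, function () { return FakeS3; });
console.log(storage.bucket); // my-blog-images
```

Passing only the class's own sub-object keeps core options and other classes' settings out of its reach.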

If we want to require config to enable a storage class, rather than just defaulting to using whatever is in /content/storage/ as an override, then the config could look something like:

storage: {
   active: 's3',
   s3: {}
}

And in future perhaps:

storage: {
   active: {images: 's3'},
   s3: {}
}
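Both shapes of `active` could be handled by one small lookup. This is only a sketch: `activeStorageFor` and the default name used in the usage lines are made up for illustration.

```javascript
// Resolves which storage class handles a file type.
// `active` may be a string (one class for everything) or a map keyed by
// file type; anything not covered falls back to the default.
function activeStorageFor(fileType, storageConfig, defaultName) {
    var active = storageConfig && storageConfig.active;

    if (typeof active === 'string') {
        return active;
    }
    if (active && typeof active === 'object' && active[fileType]) {
        return active[fileType];
    }
    return defaultName;
}

console.log(activeStorageFor('images', {active: 's3'}, 'local-file-store')); // s3
console.log(activeStorageFor('themes', {active: {images: 's3'}}, 'local-file-store')); // local-file-store
```

Supporting both shapes from the start would let the simple string form keep working if per-type storage is added later.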

This is how I imagine we can build a system that is really useful right away, and fully extensible in the near future. It doesn't require apps (neatly side-stepping any chicken-and-egg situations there), it is super simple, and I think it works for a lot of use cases other than Heroku. What I'd like is some feedback on my proposal: do you have a better idea? Did I miss something? Do you have a use case that this will or won't work for?

Once the spec is tied down and accepted, then I will start raising smaller issues to hopefully get this work underway!

@phated
Contributor

phated commented Jan 16, 2015

👍 We just stumbled on this problem because we are using Ghost inside Docker and we lose all images when we rebuild and deploy. I think everything is covered in this for our use case.

@chilts
Contributor

chilts commented Jan 24, 2015

The only thing I can see with this proposal is that if someone puts an s3.js file storage script into their content/storage/ directory, then it'll have to somehow come with its own prerequisites, e.g. aws-sdk. Not sure if this will make this harder for the end user. Am just wondering: if we add S3, Google Storage and perhaps Rackspace Cloud Files as defaults, then for other storage backends the user can do it themselves? These three would probably cover 90% of cases (just thinking out loud).

@ErisDS
Member Author

ErisDS commented Jan 24, 2015

It doesn't really make any sense to add extra dependencies 'just in case' someone wants it, especially when the potential user group is so small (no one is using storage right now, and the original version of this issue has been around for a looong time). If you're configuring Ghost to use an external storage layer, I think that it's acceptable to have to manage a dependency or two yourself.

The planned app system has support for dependencies, so there will be other ways to solve this in future.

@jonatansberg

How about managing all the external dependencies for this through a package.json-file/npm instead of trying to solve these issues all over again?

If so, then running npm install --save ghost-storage-s3 and adding the necessary config and/or environment variables should be enough, assuming you're using Ghost as an npm module.
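One way this could fit with the /content/storage/ proposal is a lookup that tries the local file first and then an npm package. This is only a sketch: the `ghost-storage-<name>` naming convention is taken from the example above, and `loadStorageModule` plus the injected `requireFn` are assumptions for illustration.

```javascript
// Tries each candidate module id in order and returns the first that loads.
function loadStorageModule(name, requireFn) {
    var candidates = ['./content/storage/' + name, 'ghost-storage-' + name],
        i;

    for (i = 0; i < candidates.length; i += 1) {
        try {
            return requireFn(candidates[i]);
        } catch (err) {
            // Only swallow "not found"; rethrow real errors from inside the module.
            if (err.code !== 'MODULE_NOT_FOUND') {
                throw err;
            }
        }
    }
    throw new Error('No storage module found for "' + name + '"');
}

// Usage with a fake require that only knows about one installed package.
var installed = {'ghost-storage-s3': {name: 's3 adapter'}};

function fakeRequire(id) {
    if (!installed[id]) {
        var err = new Error("Cannot find module '" + id + "'");
        err.code = 'MODULE_NOT_FOUND';
        throw err;
    }
    return installed[id];
}

console.log(loadStorageModule('s3', fakeRequire).name); // s3 adapter
```

In a real install, `requireFn` would simply be Node's own `require`, so the package's dependencies are resolved by npm as usual.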

@avsd

avsd commented Feb 11, 2015

If we are talking about 3rd-party PaaS, it is possible to work with their API directly from the front-end (Ember.js), without running anything on the Node.js backend. It will make implementation simpler and more transparent. So, only secret keys will be stored on the backend, and propagated to the Ember.js UI.

What do you think about it?

We've already started implementing Uploadcare integration with Ghost this way (I'm sure some of you have seen it already): https://ghost.org/forum/plugins/17929-hosting-images-for-the-blog-on-uploadcare-infrastructure/

@sethbrasile

I'm with you @mrlundis. Why not make it standard practice to add a package.json to content/storage/? That way folks can write storage adapter packages and they can be git cloned and npm installed into content/storage/?

@tehnorm

tehnorm commented Feb 13, 2015

That is a great idea! NPM can really handle any type of package as long as it conforms to the package.json format. I.e., one could write a package for Ghost that could be delivered to the end user via npm install. This is a highly flexible approach.

If the worry is about how an end user configures this new package, why not add a layer to Ghost that makes the config values the installed package requires available on a screen in the Settings UI? You could even add those optional or required config value fields directly to the package.json that defines the Ghost package.

http://stackoverflow.com/questions/10065564/add-custom-metadata-or-config-to-package-json-is-it-valid

Potential package.json layout for Ghost packages.

{
        "name" : "ghost-storage-s3",
        "version" : "0.0.1",
        "dependencies": {
                "async": "0.7.0",
                "aws-sdk": "2.0.x"
        },
        "ghost" : {
                "config" : {
                        "aws_key" : {
                                "type" : "text",
                                "name" : "Amazon Web Services Key",
                                "help" : "This is used to setup your connection to configure AWS S3 storage. See: some-url.com for details",
                                "required" : true
                        },
                        "aws_secret" : {
                                "type" : "text",
                                "name" : "Amazon Web Services Secret",
                                "required" : true
                        }

                }
        }
}
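To show how Ghost might consume that metadata, here's a small sketch that reports which required fields still have no value. The `ghost.config` layout follows the example above; `missingConfig` and the sample key value are hypothetical.

```javascript
// Returns the keys marked "required" in pkg.ghost.config that have no value.
function missingConfig(pkg, values) {
    var fields = (pkg.ghost && pkg.ghost.config) || {};

    return Object.keys(fields).filter(function (key) {
        var value = values[key];
        return fields[key].required === true && (value === undefined || value === '');
    });
}

// A trimmed-down version of the package.json layout above.
var pkg = {
    name: 'ghost-storage-s3',
    ghost: {
        config: {
            aws_key: {type: 'text', required: true},
            aws_secret: {type: 'text', required: true}
        }
    }
};

console.log(missingConfig(pkg, {aws_key: 'AKIA-EXAMPLE'})); // [ 'aws_secret' ]
```

A Settings UI could use the same field definitions to render inputs and this kind of check to decide whether the package can be enabled.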

Triple good part about this is that the package builders would have the whole power and scope of the node/npm ecosystem at their disposal when building packages. Simply a require() away.

Happy to help further this idea if it seems sane.

@ErisDS
Member Author

ErisDS commented May 13, 2015

This is documented on the wiki: Using a custom storage module

@Potherca

👍 Awesome stuff!
