Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assets & Resources URL rules #10477

Closed
2 of 3 tasks
naz opened this issue Feb 11, 2019 · 15 comments
Closed
2 of 3 tasks

Assets & Resources URL rules #10477

naz opened this issue Feb 11, 2019 · 15 comments
Assignees
Labels
affects:api Affects the Ghost API server / core Issues relating to the server or core of Ghost stale [triage] Issues that were closed to to lack of traction

Comments

@naz
Copy link
Contributor

naz commented Feb 11, 2019

Context

Ghost deals with different types of URL:

  1. assets (images, js files, css files, maybe one day movies/pdfs/who knows) in fields
  2. assets in the HTML / content fields
  3. resource urls: the url for a post, a tag or an author - generated by Ghost
  4. relative paths in the HTML / content fields

At the moment Ghost treats all 4 of these the same. They are all returned as absolute by default sinse API v2.

Assets

Assets that are served by Ghost, i.e. anything that has a path that starts with {subdir}/content/images/... needs to always be an absolute URL. The file was uploaded to Ghost, it will be served from Ghost. All assets MUST always be served as an absolute URL from any API. This means that any content/images url should appear absolute anywhere other than in the DB.

Long term, we should either use a separate API endpoint, or treat /content/images as an API endpoint. URLs for assets shouldn’t change.

Resource URLs & Relative Paths

Resource URLs also need to appear absolute via the API by default. Right now the url helper in Ghost takes care of changing this back for the one case (Ghost themes).

Paths in content should not be manipulated unless specifically requested. The absolute_urls=true flag should be reinstated to cover just this case.

Fixes needed to conform to rules above

  • Transform absolute to relative paths in mobiledoc/html Absolute paths in html field #10472
  • Do an audit of all URLs that are used in Ghost and stored in the db
  • Refactor absolute -> relative transformation from API layer to model layer
@naz naz added server / core Issues relating to the server or core of Ghost affects:api Affects the Ghost API cleanup labels Feb 11, 2019
naz added a commit to naz/Ghost that referenced this issue Feb 11, 2019
refs TryGhost#10477
closes TryGhost#10472

- Adds transformation for any asset absolute URL's into relative used in mobiledoc
@kirrg001
Copy link
Contributor

We had two bug reports in the last months because we've suddenly stored absolute asset urls in our database. It was caused by transforming incoming absolute asset urls on API layer into relative urls on API layer. We have just forgotten to add a protection for that. It happened once for posts and now for settings (#10590).

should appear absolute anywhere other than in the DB.

The model layer does not protect against storing absolute paths. This needs sorting out asap @gargol

@kirrg001
Copy link
Contributor

Regarding Audit:

Are there any other resources which have url fields, which could potentially get stored absolute?

@naz naz mentioned this issue Mar 12, 2019
6 tasks
@naz
Copy link
Contributor Author

naz commented Mar 13, 2019

Results of an audit for URL related fields we currently use. Fields without any comments do input/output serialization and transform local absolute URLs into relative ones:
posts:

  • feature_image
  • og_image
  • twitter_image
  • mobiledoc (url transformation for input content, NO output transformation)
  • html (url transformation for content in both input & output)
  • canonical_url
  • url (virtual)

users

  • profile_image
  • cover_image
  • website? (no transformation, but could be in case somebody puts a url to local instance e.g.: website: http://ghost.local/author/naz)
  • url (virtual)

tags

  • feature_image
  • url (virtual)

settings

  • cover_image
  • icon
  • logo

subscribers

  • subscribed_url
  • subscribed_referrer (not transformed)
  • unsubscribed_url (doesn't seem to be used in codebase, should we remove it?)

webhooks (no transformations)

  • target_url (should transform?)

Non DB resources:
images

  • url (output transormation from local path)

Would suggest moving transformation handling for non virtual url-containing fields (marked in bold) into model layer to keep the logic more central. The rest of fields needs discussion/feeback. cc @kirrg001

@kirrg001
Copy link
Contributor

Let's discuss on Monday 👍

@kirrg001
Copy link
Contributor

kirrg001 commented Mar 19, 2019

#10477 #10620 is related.

@naz
Copy link
Contributor Author

naz commented Mar 20, 2019

@kirrg001 think you meant some other issue as the above is linking back to here 😅

@ErisDS
Copy link
Member

ErisDS commented Mar 21, 2019

Related: #10069 and #10598

From #10069 we can see there is not detail in our URL types:

When we talk about assets, we mostly mean images served from Ghost, e.g. they start with /content/images/. These are served from Ghost/belong to Ghost, we therefore "own" this URL, and should serve it as relative.

When we look at URLs inside of content (and also code injection!):

There are also theme assets, which may have a relative path, and be referenced in content. Ghost also owns these assets, but it may be too hard to detect them right now. If possible, they should be served absolute.

Additionally, there are external assets, which may have a relative path, and be nothing to do with Ghost. We don't own these URLs, we should not change them.

This rule guide needs to take into account what happens:

  • In storage
    • TLDR here is only external URLs should be absolute, everything else MUST be stored relative.
  • In the API
    • Here the behaviour is different for the type of URL
  • In clients (e.g. SDK, or theme helpers)
    • Here the behaviour is different for the type of URL, and depending on whether the client is external (SDK) or internal (theme helpers).

I know this is an active concern, but reinforcing that I am hitting my head against the lack of clarity in these rules again today.

It looks to me like the absolute_urls parameter still exists in the v2 API, but I don't know what it does anymore, or whether it's useful or should have been removed.

Once we have the rules set, then I can at least document the rules as they are meant to be, and we can look for places where we are not following it.

kirrg001 added a commit to kirrg001/Ghost that referenced this issue Mar 22, 2019
refs TryGhost#10620, refs TryGhost#10477

- editor uploads an image
- receives absolute path
- inserts into mobiledoc
- updates post
- server stores relative asset paths
- we need to serve absolute asset paths, otherwise editor could mark content as "dirty"
@kirrg001
Copy link
Contributor

Summary from today:

  • the editor suffered from a bug, because we transform mobiledoc from absolute to relative internal images urls for storage
  • but server serves relative internal image paths
  • editor marks the changes as "dirty"

We tried fixing:

Both did not work 😈

We need a whole new plan!

@stale
Copy link

stale bot commented Jun 30, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale [triage] Issues that were closed to to lack of traction label Jun 30, 2019
@naz
Copy link
Contributor Author

naz commented Jul 1, 2019

Not stale. In progress by @kevinansfield 😉

@stale stale bot removed the stale [triage] Issues that were closed to to lack of traction label Jul 1, 2019
kevinansfield added a commit to kevinansfield/Ghost that referenced this issue Oct 7, 2019
refs TryGhost#10477

The unsaved changes modal is displaying even when the post has been saved if images have been uploaded because the server is transforming absolute image urls to relative during input of the `mobiledoc` field but not transforming them back to absolute during output. The editor then thinks it's out of sync and shows the warning when trying to leave.

- `@tryghost/url-utils` has been updated with new methods for transforming URLs in mobiledoc content
- moves absolute->relative transformation from the API input serializers into the Post model
- transforms URLs in more fields for a more comprehensive transformation and fewer issues when re-configuring a site's domain
  - previously there could be problems with internal links between posts not being transformed so you could change the url config to newdomain.com but links in post content would still be pointing to olddomain.com
- updates the API post output serializers to transform all modified fields
- drops the `?absolute_urls=true` param switch from the `canary` API post output serializer so that all URLs are output as absolute
  - we're transforming more urls to relative when saving so this is necessary to ensure the unsaved changes modal is not triggered
  - the query param isn't documented and will disappear in v3
kevinansfield added a commit that referenced this issue Oct 7, 2019
refs #10477

The unsaved changes modal is displaying even when the post has been saved if images have been uploaded because the server is transforming absolute image urls to relative during input of the `mobiledoc` field but not transforming them back to absolute during output. The editor then thinks it's out of sync and shows the warning when trying to leave.

- `@tryghost/url-utils` has been updated with new methods for transforming URLs in mobiledoc content
- moves absolute->relative transformation from the API input serializers into the Post model
- transforms URLs in more fields for a more comprehensive transformation and fewer issues when re-configuring a site's domain
  - previously there could be problems with internal links between posts not being transformed so you could change the url config to newdomain.com but links in post content would still be pointing to olddomain.com
- updates the API post output serializers to transform all modified fields
- drops the `?absolute_urls=true` param switch from the `canary` API post output serializer so that all URLs are output as absolute
  - we're transforming more urls to relative when saving so this is necessary to ensure the unsaved changes modal is not triggered
  - the query param isn't documented and will disappear in v3
@sampumon
Copy link

Sorry for random rambling, but I don't understand why Ghost must serve assets as absolute urls?

Themes serve their internal assets (js/css/images) relative, and it works just fine.

I can see many issues referred in this issue being caused by transforming between relative and absolute urls.

What would be the issue, if everything served by Ghost would be relative, and there were zero transformations between layers?

Thanks for your good work, eagerly waiting 3️⃣.0️⃣ to arrive 🚚

@naz
Copy link
Contributor Author

naz commented Oct 18, 2019

What would be the issue, if everything served by Ghost would be relative, and there were zero transformations between layers?

@sampumon Relative URLs won't work if the content is being served under completely different domain. To give one example of why: Ghost is a headless CMS. It should support a very basic case for static site generators being pure consumers of data from whatever domain the static site is running on 😃

@naz
Copy link
Contributor Author

naz commented Nov 13, 2019

@kevinansfield do you think we need to keep this issue around or want to reuse it for future work when optimizing how the relative->absolute->relative transformations work?

@stale
Copy link

stale bot commented Feb 11, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale [triage] Issues that were closed to to lack of traction label Feb 11, 2020
@stale stale bot closed this as completed Feb 18, 2020
@astuanax
Copy link

astuanax commented Oct 8, 2022

What would be the issue, if everything served by Ghost would be relative, and there were zero transformations between layers?

@sampumon Relative URLs won't work if the content is being served under completely different domain. To give one example of why: Ghost is a headless CMS. It should support a very basic case for static site generators being pure consumers of data from whatever domain the static site is running on smiley

This is actually the other way around in many cases.
Forcing these absolute URLs will definitely make sure images are not showing up in a secure production environment.

In a production environment, you surely do not want any of the admin sections available,
so they are most likely on a protected or internal domain or even a localhost domain.

Absolute URLs, as a result, will point to a protected or non-existing domain and return a 404 or timeout.
Any solution that involves opening up a secure area will become a security issue.

I want to use static generators so we don't have to expose the back end.
With Ghost this is simply not possible.

It is bad practice to force URLs in relative or absolute mode.
In the end, this is content, and a CMS should never dictate how content should look.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects:api Affects the Ghost API server / core Issues relating to the server or core of Ghost stale [triage] Issues that were closed to to lack of traction
Projects
None yet
Development

No branches or pull requests

6 participants