Datasets in production CKAN website still point to development after production implementation #4070

Closed
ply112 opened this issue Mar 5, 2018 · 5 comments
@ply112
ply112 commented Mar 5, 2018

CKAN Version if known (or site URL)

2.6.1

Please describe the expected behaviour

We have a large number of datasets uploaded to our development CKAN website, and would like to simply make a copy of those datasets to production by cloning the server.

Please describe the actual behaviour

The server has already been cloned from development. Now, in production, the URLs all point to datasets stored on our development server instead of production. We realized that in the "resource" table in PostgreSQL, the URL column has the URL of the development server hardcoded; therefore, we are not getting the files that reside on production. Is there an easy way to resolve this? We would not want to re-upload the hundreds of files.

What steps can be taken to reproduce the issue?

On production, it shows the urls for all the datasets are still pointing to development.

@amercader amercader self-assigned this Mar 6, 2018
@amercader
Member

@ply112 you are right, at some point full URLs are stored in the database, and that's not what was intended so there is a bug somewhere (the expected behaviour is that only the file name is stored and URLs are generated using the ckan.site_url config option). We need to find what's causing it.

In the meantime, it should be safe to run an UPDATE directly on the url column of the resource and resource_revision tables in the database so that it uses the new host.
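
As a rough illustration of that interim fix, here is a minimal sketch of a prefix-only host rewrite on the url column of both tables. Several things in it are assumptions, not taken from this thread: the hosts `dev.example.org` and `prod.example.org` are placeholders, the two-column table layout is a toy stand-in for the real CKAN schema, and SQLite is used only so the example runs on its own. Against the real PostgreSQL database the same `substr`/`length` expression should apply, but the parameter style depends on the driver, and a backup beforehand is advisable.

```python
import sqlite3

# Hypothetical hosts; substitute the real development and production base URLs.
OLD_BASE = "http://dev.example.org"
NEW_BASE = "http://prod.example.org"

# Prefix-only rewrite: rows whose url starts with the old base get the new base.
# substr() and length() are available in both SQLite and PostgreSQL.
UPDATE_SQL = """
UPDATE {table}
SET url = :new || substr(url, length(:old) + 1)
WHERE substr(url, 1, length(:old)) = :old
"""


def rewrite_host(conn, old_base, new_base,
                 tables=("resource", "resource_revision")):
    """Rewrite the host prefix in the url column; return rows changed per table."""
    changed = {}
    for table in tables:
        cur = conn.execute(UPDATE_SQL.format(table=table),
                           {"old": old_base, "new": new_base})
        changed[table] = cur.rowcount
    conn.commit()
    return changed


def demo():
    """Build a toy in-memory stand-in for the two tables and run the rewrite."""
    conn = sqlite3.connect(":memory:")
    for table in ("resource", "resource_revision"):
        # Simplified schema (assumption): the real tables have many more columns.
        conn.execute(f"CREATE TABLE {table} (id TEXT, url TEXT)")
        conn.executemany(
            f"INSERT INTO {table} VALUES (?, ?)",
            [
                ("a", OLD_BASE + "/dataset/d1/resource/a/download/data.csv"),
                ("b", "http://elsewhere.example.com/file.csv"),  # external link
            ],
        )
    changed = rewrite_host(conn, OLD_BASE, NEW_BASE)
    urls = [row[0] for row in conn.execute("SELECT url FROM resource ORDER BY id")]
    return changed, urls
```

Matching on the prefix rather than replacing the host string anywhere in the value leaves externally linked resources untouched.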

@ply112
Author

ply112 commented Mar 6, 2018

Thank you for your quick response. I found a real quick fix. On the website, simply click "Update Dataset" without making any changes, and it will refresh the urls for all the resources within that dataset. So, no need to update 1000 files, one at a time.

Questions: Is it true that the resource URLs one sees on the CKAN website do not come from the URL column of the "resource" table? The fact that this URL column is still pointing to development should have no impact on our production server, right?

Thanks again.

@amercader
Member

@ply112 So if you click "Update Dataset" the URLs are fixed on the UI but still have the development server in the database?

The URL shown in the UI comes from the url column of the database, but this should only contain the file name for uploaded resources, the rest should be computed using the ckan host in the configuration. As I said it looks like there is a bug somewhere.

@tino097 tino097 self-assigned this Apr 30, 2018
@tino097
Member

tino097 commented Jun 12, 2018

After some time debugging, I located the following line, which causes the bug: the fully qualified URL is stored in the DB instead of just the file name.

@ply112 ply112 closed this as completed Jun 14, 2018
@wardi
Contributor

wardi commented Jul 31, 2018

@tino097 I think it's fine to fix this first in the dictize (outgoing) code you linked to. Just strip everything but the last part (the file name) of the URL stored in the db and regenerate the full link each time. The code that stores the URLs can be updated to not save the first part of the URL, but that's just to be tidier; it's not the real fix.
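
The outgoing-side approach described here could look roughly like the sketch below. It is not CKAN's actual dictize code: the function names, the example hosts and IDs, and the `/dataset/<id>/resource/<id>/download/<file>` layout are assumptions made for illustration. The idea is only that whatever is stored, a bare file name or a stale fully qualified URL, is reduced to its last path segment and the link is rebuilt from the configured site URL.

```python
def stored_filename(stored_url):
    """Keep only the last path segment of whatever is stored in the db."""
    return stored_url.rsplit("/", 1)[-1]


def download_url(site_url, package_id, resource_id, stored_url):
    """Rebuild the full link from the configured site URL on every read.

    The URL layout below is an assumed one, used for illustration only.
    """
    filename = stored_filename(stored_url)
    return "{}/dataset/{}/resource/{}/download/{}".format(
        site_url.rstrip("/"), package_id, resource_id, filename
    )


# Hypothetical values: a stale development URL and a bare file name
# both resolve to the same production link.
stale = "http://dev.example.org/dataset/d1/resource/r1/download/data.csv"
from_stale = download_url("http://prod.example.org", "d1", "r1", stale)
from_bare = download_url("http://prod.example.org/", "d1", "r1", "data.csv")
```

Because the link is regenerated on each read, rows that already contain a development URL render correctly without any change to the database.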
