make some changes to fetch #53

Merged
chadwhitacre merged 5 commits into master from fetch-upgrades
Aug 28, 2015

Conversation

chadwhitacre
Contributor

This is to keep up with structural changes to the spreadsheet.

@chadwhitacre
Contributor Author

We started a new spreadsheet and @timothyfcook is copying cleaned-up data over to it (from Excel). I'm having trouble fetching the spreadsheet. I tried "File > Publish to the web ..." but I'm still getting a sign-in page when trying to download the data.

@chadwhitacre
Contributor Author

Did I keep notes on this the last time we went through this?

@chadwhitacre
Contributor Author

Doesn't look like it: #14 #15. :-/

@chadwhitacre
Contributor Author

Here's what I found last time:

Warning: The public visibility does not work for spreadsheets that are made "Public on the web" from the "Visibility options" portion of the sharing dialog of a Google Sheets file. "Published to the web" and "Public on the web" are different ways to share a spreadsheet. We are aware that this is confusing, and will address it in a future version of the API. For now, we hope that this detailed warning prevents confusion.

https://developers.google.com/google-apps/spreadsheets/worksheets

@chadwhitacre
Contributor Author

The private visibility can be replaced with the public visibility, which enables the feed to work without authorization for spreadsheets that have been "Published to the Web". The public visibility is supported on the worksheets, list, and cells feeds. The public visibility is useful for accessing the contents of a spreadsheet from the client context of a web page in JavaScript, for example.

@chadwhitacre
Contributor Author

Soooooo ... I did publish to web, and the fetch script is still getting the sign-in page. What's up, Google?

@chadwhitacre
Contributor Author

/me resists @mention'ing @google. 👀

@chadwhitacre
Contributor Author

The link in the Publish to the Web dialog works for me without auth. I can curl it and get csv.

[Screenshot, 2015-08-28 11:32:39 AM]

@chadwhitacre
Contributor Author

What's wrong with the url in fetch?

@chadwhitacre
Contributor Author

I think it's actually the URL for listing the available worksheets that's failing.

@chadwhitacre
Contributor Author

No, it's the actual CSV download that's failing:

(env)$ ./fetch.py 
Fetched 1 worksheet(s).
Problem downloading https://docs.google.com/spreadsheets/d/10PurQxMbALCYNu7I3KfgUb2oMz4Uk5dLPZbTkdNb0ZM/export?gid=1256639907&format=csv ...
They're asking us to sign in. Try 'File > Publish to the web ...'.
(env)$

Is the gid wrong?

@chadwhitacre
Contributor Author

Harumph. The individual CSV URLs are given to us by Google in the worksheet listing. We're not computing those ourselves. What gives?

@chadwhitacre
Contributor Author

https://docs.google.com/spreadsheets/d/10PurQxMbALCYNu7I3KfgUb2oMz4Uk5dLPZbTkdNb0ZM/pub?output=csv
https://docs.google.com/spreadsheets/d/10PurQxMbALCYNu7I3KfgUb2oMz4Uk5dLPZbTkdNb0ZM/export?gid=1256639907&format=csv

@chadwhitacre
Contributor Author

Why was the URL they gave us working before and it's not now?

Are there multiple URLs in the worksheet listing, and is the right one in there for us to grab?

@timothyfcook
Contributor

I've had this problem before when pulling data from Google Sheets ...

@chadwhitacre
Contributor Author

:-/

@chadwhitacre
Contributor Author

I hacked it by extracting the gid from the URL they give us, and constructing a URL that works.
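
For anyone following along, here's a rough sketch of what that kind of hack can look like. It is not the code from this PR. The function name `published_csv_url` is made up for illustration, it is written for Python 3, and the `pub?gid=...&single=true&output=csv` shape is taken from the URL the script logs further down in this thread.

```python
from urllib.parse import urlparse, parse_qs, urlencode


def published_csv_url(export_url):
    """Turn an /export?gid=...&format=csv URL into a /pub?...&output=csv URL.

    Hypothetical helper: the name and structure are assumptions, not the
    actual implementation in fetch.py.
    """
    parts = urlparse(export_url)

    # Pull the worksheet id out of the query string Google gave us.
    gid = parse_qs(parts.query)["gid"][0]

    # The path looks like /spreadsheets/d/<key>/export; drop the last segment.
    base_path = parts.path.rsplit("/", 1)[0]

    # Rebuild the query in the form that works for published sheets.
    query = urlencode({"gid": gid, "single": "true", "output": "csv"})
    return "{}://{}{}/pub?{}".format(parts.scheme, parts.netloc, base_path, query)
```

Feeding it the failing `export?gid=1256639907&format=csv` URL from above gives back a `pub?gid=1256639907&single=true&output=csv` URL on the same spreadsheet key.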

@chadwhitacre
Contributor Author

Now I'm working on a bug where there's only one record in the JSON we dump.

@chadwhitacre
Contributor Author

It's because fetch depends on the order of fields to determine what the uid is, and with the uuid column gone I had to update that constant.
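
To illustrate how a stale positional constant can collapse the dump to a single record, here's a minimal sketch. Everything in it is hypothetical: the constant `UID_COLUMN`, both function names, and the sample rows are made up, and I'm assuming records end up in a dict keyed by uid. It also shows looking the column up by header name instead, which would survive a column being removed.

```python
import csv
import io

# Hypothetical positional constant. If a column is removed from the sheet
# and this is not updated, it points at the wrong field.
UID_COLUMN = 0


def index_by_position(csv_text, uid_column=UID_COLUMN):
    """Key each record on whatever sits in column `uid_column`."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    return {row[uid_column]: dict(zip(header, row)) for row in body}


def index_by_name(csv_text, uid_field="uid"):
    """Key each record on the column whose header is `uid_field`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[uid_field]: row for row in reader}


# Made-up sample: column 0 holds the same value for every row.
SAMPLE = "topic,uid,title\nstorytelling,a,First\nstorytelling,b,Second\n"

by_position = index_by_position(SAMPLE)  # both rows share the key "storytelling"
by_name = index_by_name(SAMPLE)          # keyed on "a" and "b"
```

With the wrong column, every row lands on the same key and later rows overwrite earlier ones, which is one way to end up with a single record.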

@chadwhitacre
Contributor Author

@timothyfcook A little more clean-up to be done, it appears:

(env)$ ./fetch.py 
Getting https://spreadsheets.google.com/feeds/worksheets/10PurQxMbALCYNu7I3KfgUb2oMz4Uk5dLPZbTkdNb0ZM/public/basic ...
Getting https://docs.google.com/spreadsheets/d/10PurQxMbALCYNu7I3KfgUb2oMz4Uk5dLPZbTkdNb0ZM/pub?gid=1256639907&single=true&output=csv ...
12 bad uid(s)!
sheet                    field                    value                   
------------------------ ------------------------ ------------------------
storytelling             comes_after              carnegie-science-center-i5-video-competition
storytelling             comes_after              mcg-youth-&-arts-the-wide-world-of-photo
storytelling             comes_after              the-criterion-collection-criterion-collection-top-10s
storytelling             comes_after              vimeo-directing-101     
storytelling             comes_before             carnegie-science-center-i5-video-competition
storytelling             comes_before             mcg-youth-&-arts-the-wide-world-of-photo
storytelling             comes_before             the-criterion-collection-criterion-collection-top-10s
storytelling             comes_before             vimeo-directing-101     
storytelling             uid                      carnegie-science-center-i5-video-competition
storytelling             uid                      mcg-youth-&-arts-the-wide-world-of-photo
storytelling             uid                      the-criterion-collection-criterion-collection-top-10s
storytelling             uid                      vimeo-directing-101     
(env)$
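
In case it helps with the clean-up, here's a sketch of a check that produces a report in this shape. The rule is a guess on my part: all four flagged values contain a digit or an ampersand, so I'm assuming a good uid is lowercase words joined by single hyphens. The real rule in fetch.py may differ, and the names `UID_PATTERN`, `find_bad_uids`, and `format_report` are made up.

```python
import re

# Assumption: a valid uid is lowercase ASCII words joined by single hyphens.
# This is inferred from the flagged values above, not read from fetch.py.
UID_PATTERN = re.compile(r"^[a-z]+(-[a-z]+)*$")

# The three fields that show up in the report above.
UID_FIELDS = ("uid", "comes_after", "comes_before")


def find_bad_uids(sheet_name, records):
    """Return sorted (sheet, field, value) tuples for values that fail the pattern."""
    bad = []
    for record in records:
        for field in UID_FIELDS:
            value = record.get(field, "")
            # Empty cells are skipped; only non-empty values are checked.
            if value and not UID_PATTERN.match(value):
                bad.append((sheet_name, field, value))
    return sorted(bad)


def format_report(bad):
    """Render the bad values as a fixed-width, three-column table."""
    row_format = "{:<24} {:<24} {:<24}"
    lines = ["{} bad uid(s)!".format(len(bad))]
    lines.append(row_format.format("sheet", "field", "value"))
    lines.append(" ".join(["-" * 24] * 3))
    for row in bad:
        lines.append(row_format.format(*row))
    return "\n".join(lines)
```

Under that assumed rule, fixing a value like `vimeo-directing-101` means spelling out or dropping the digits in all three fields where it appears.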

@chadwhitacre
Contributor Author

@timothyfcook I've got the script back up to speed with the new sheet. Let's get that data cleaned, and then we can regenerate resources.json and map.svg, and merge this PR.

chadwhitacre added a commit that referenced this pull request Aug 28, 2015
chadwhitacre merged commit 1e24f1b into master Aug 28, 2015
chadwhitacre deleted the fetch-upgrades branch August 28, 2015 18:39