Web Scraping Script? #1

Closed
Camiann opened this issue Aug 1, 2018 · 3 comments

Camiann commented Aug 1, 2018

@Nishnha This is great work. I was wondering if you could post the script you used to scrape the public comment submissions and attachments from regulations.gov.

Owner

Nishnha commented Aug 3, 2018

Hey @Camiann, I do have the script saved on another machine, but I won't be able to access it for a couple of weeks.
I'll keep this issue open as a reminder to add it to the repo.

In the meantime, you might try doing it yourself with webscraper.io, which is the tool I used. It's intuitive, and it gets past the JavaScript loading wall on the regulations.gov website.

Nishnha self-assigned this Aug 3, 2018

Author

Camiann commented Aug 12, 2018

Thanks!

Owner

Nishnha commented Sep 5, 2018

@Camiann,

Here is a Gist of the webscraper.io sitemap export that I used for web scraping the regulations.gov comments.

I had to recreate the script, so the naming may differ slightly from the original, but the functionality should be identical.
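
For anyone who hasn't seen one, a webscraper.io sitemap export is a JSON document with an `_id`, a `startUrl`, and a list of `selectors`. Below is a minimal Python sketch of that general shape. The sitemap id, the selector id, and the CSS selector are hypothetical placeholders, not the values from the Gist, and the start URL is just the site's home page rather than the docket page actually scraped. The exact set of fields may also vary between versions of the extension.

```python
import json


def text_selector(selector_id, css, parent="_root", multiple=False):
    """Return a dict shaped like a webscraper.io text selector entry."""
    return {
        "id": selector_id,
        "type": "SelectorText",
        "parentSelectors": [parent],  # "_root" means the start page itself
        "selector": css,
        "multiple": multiple,
        "regex": "",
        "delay": 0,
    }


def build_sitemap(sitemap_id, start_url, selectors):
    """Assemble a dict shaped like a webscraper.io sitemap export."""
    return {"_id": sitemap_id, "startUrl": [start_url], "selectors": selectors}


# Hypothetical values for illustration only; the real sitemap will differ.
sitemap = build_sitemap(
    "regulations-comments",
    "https://www.regulations.gov/",
    [text_selector("comment_text", "div.comment-text", multiple=True)],
)

# The resulting JSON string is the kind of text the extension's import dialog accepts.
sitemap_json = json.dumps(sitemap, indent=2)
print(sitemap_json)
```

The printed JSON can be pasted into the extension's "Import Sitemap" dialog and then adjusted in its selector editor.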

Hope this helps!

Nishnha closed this as completed Oct 23, 2018