This repository has been archived by the owner on Sep 20, 2024. It is now read-only.

P7 Crawling: Office of Elementary and Secondary Education #9

Closed
6 tasks done
Daniellappv opened this issue Feb 25, 2020 · 0 comments

Daniellappv commented Feb 25, 2020

Description: Scrape metadata for https://oese.ed.gov/

Acceptance criteria

  • We have a data dump containing all the resource metadata we can get from the target site

Task-list:

  • Crawl the site
  • Perfect the crawling to reach as many resources as possible
  • Integrate with the existing pipeline rules (provide an HTML response for the parser)
  • Test run with a dummy parser - it should collect datasets and dump them into JSON files
  • Push the code once it meets all of the above criteria
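
For reference, the task list above can be sketched roughly as follows. This is only a minimal, standard-library illustration and not the project's actual pipeline: the names `crawl`, `dummy_parser`, `extract_links`, and `RESOURCE_EXTENSIONS`, the record layout, and the choice of file extensions that count as resources are all assumptions made for the example. The `fetch` callable is injected so the crawl can be tried against canned HTML before pointing it at the real site.

```python
import json
from collections import deque
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urldefrag, urljoin, urlparse

# Assumption: extensions treated as downloadable resources (not specified in the issue).
RESOURCE_EXTENSIONS = (".csv", ".xls", ".xlsx", ".pdf", ".zip", ".json")


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(base_url, html):
    """Return absolute, fragment-free (url, text) pairs found in the HTML."""
    collector = LinkCollector()
    collector.feed(html)
    return [(urldefrag(urljoin(base_url, href)).url, text)
            for href, text in collector.links]


def is_resource(url):
    return urlparse(url).path.lower().endswith(RESOURCE_EXTENSIONS)


def dummy_parser(url, html):
    """Hypothetical stand-in parser: one dataset record per page that links to resources."""
    resources = [{"url": link, "name": text}
                 for link, text in extract_links(url, html) if is_resource(link)]
    return {"source_url": url, "resources": resources} if resources else None


def crawl(start_url, fetch, out_dir, max_pages=100):
    """Breadth-first crawl of a single host; fetch(url) returns HTML text or None."""
    host = urlparse(start_url).netloc
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    seen, queue, dumped, fetched = {start_url}, deque([start_url]), [], 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        html = fetch(url)  # the HTML response handed to the parser
        fetched += 1
        if html is None:
            continue
        dataset = dummy_parser(url, html)
        if dataset:
            path = out_dir / f"dataset_{len(dumped):04d}.json"
            path.write_text(json.dumps(dataset, indent=2), encoding="utf-8")
            dumped.append(path)
        # Follow only same-host page links that have not been queued yet.
        for link, _ in extract_links(url, html):
            if urlparse(link).netloc == host and not is_resource(link) and link not in seen:
                seen.add(link)
                queue.append(link)
    return dumped
```

For a dry run, `fetch` can be as simple as the `get` method of a dict that maps URLs to HTML strings. A real run would need a `fetch` that performs the HTTP request (for example with `urllib.request.urlopen`) and returns `None` on errors or non-HTML responses.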

Jira card: https://open-data-ed.atlassian.net/browse/OD-507

nightsh changed the title from "P7 Scraping: Office of Elementary and Secondary Education" to "P7 Crawling: Office of Elementary and Secondary Education" on Mar 2, 2020
nightsh closed this as completed on Mar 11, 2020