On the AwesomeData community
The AwesomeData community consists primarily, although not solely, of its online presence: mailing lists, blog posts and comments, the GitHub repository, and so on. The vision of the AwesomeData community is to contribute a pure list of high-quality datasets for open communities such as academia, research, and education.
The following policy is a guideline for proposing new data items and maintaining existing items with outdated information:
A dataset is considered high quality when one or more of the following criteria are met:
- Difficult to obtain legally in the open community;
- Contributing valuable knowledge for a specific domain;
- Able to be downloaded directly from the linked site, i.e., not barred by login or purchasing;
- No advertisement! No Spam! No reputation promotion!
A new pull request will be merged into the core repository after it passes automatic validation and a maintainer's review.
An existing dataset item with outdated information (e.g., an unavailable site) will be removed if it stays that way for a while without an update.
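As a rough illustration of what "unavailable site" could mean in practice, the sketch below checks whether a homepage URL still responds. It is not the project's maintenance tooling; the function name `site_is_available`, the 10-second timeout, and the injectable `opener` parameter are all assumptions made for this example.

```python
import urllib.error
import urllib.request


def site_is_available(url, opener=urllib.request.urlopen, timeout=10):
    """Return True if the URL answers without an HTTP or network error.

    Hypothetical helper: `opener` defaults to urllib's urlopen and can be
    swapped out (for example in tests) to avoid real network traffic.
    """
    try:
        with opener(url, timeout=timeout) as response:
            # urlopen follows redirects, so a reachable site normally ends at 200.
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # URLError covers HTTP errors and DNS/connection failures;
        # ValueError covers strings that are not valid URLs at all.
        return False
```

Calling `site_is_available("https://example.com")` returns `True` or `False` depending on whether the site responds. A single failed check is weak evidence, which fits the policy of removing an item only after it has stayed outdated for a while.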
How to contribute a new data entry
It is simple to contribute to APD:
- Fork the apd-core repository into your own namespace.
- Clone your project locally:
    git clone https://github.com/yourname/apd-core.git
    cd apd-core
- Create a new data entry from template
For example, to create NEW_DATASET.yml under the category folder Government:
    cp PULL_REQUEST_TEMPLATE.yml ./core/Government/NEW_DATASET.yml
Then edit the data fields as needed.
Data validation requires three essential fields: title, homepage, and category. The category must be the same as the folder name, i.e., "Government" in the example.
In a nutshell, you should get a basic entry like:
    ---
    title: New Dataset Name
    homepage: https://example.com
    category: Government
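To make the two rules concrete (three essential fields, and a category that matches its folder), here is a minimal sketch of such a check. It is not the repository's actual test script: `parse_flat_entry` and `check_entry` are hypothetical names, and the tiny parser only handles flat `key: value` lines as a stand-in for a real YAML library.

```python
from pathlib import Path

# The three essential fields named in this guide.
REQUIRED_FIELDS = ("title", "homepage", "category")


def parse_flat_entry(text):
    """Parse flat `key: value` lines (an assumed stand-in for a YAML parser)."""
    fields = {}
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines, comments, and the `---` document marker.
        if not stripped or stripped.startswith("#") or stripped == "---":
            continue
        # Split on the first colon only, so URLs in the value stay intact.
        key, sep, value = stripped.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def check_entry(path, text):
    """Return a list of problems for one entry; an empty list means none found."""
    fields = parse_flat_entry(text)
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not fields.get(name)]
    folder = Path(path).parent.name
    category = fields.get("category")
    if category and category != folder:
        problems.append(f"category {category!r} does not match folder {folder!r}")
    return problems


entry = """---
title: New Dataset Name
homepage: https://example.com
category: Government
"""
print(check_entry("core/Government/NEW_DATASET.yml", entry))  # prints []
```

The same entry placed under a different folder, for instance `core/Finance/`, would be reported as a category mismatch.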
- Run the local tests to validate your modification:
    # With python
    sudo pip install -r tests/requirements.txt
    ./tests/testing.sh
- Commit local modifications to your repository:
    git add ./core/Government/NEW_DATASET.yml
    git commit -m "Add NEW_DATASET under government"  # Any message you like
    git push origin master
- Create a new Pull Request to the trunk repository on the GitHub page, usually