Scrape GSoC organisations using a single script.

GSoC Organisation Scraper

Makes life easier by scraping GSoC organisations for a given technology instead of searching for each organisation by name. It also shows the number of times an organisation has appeared in GSoC. Built with the Python Requests library and BeautifulSoup.
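
As an illustration of the appearance count mentioned above, here is a minimal sketch. It assumes the organisation names have already been collected per year. The function name `count_appearances` and the sample data are hypothetical and are not taken from `scrape.py`.

```python
from collections import Counter


def count_appearances(orgs_by_year):
    """Map each organisation name to the number of years it appears in.

    orgs_by_year: dict of year -> list of organisation names (assumed shape).
    """
    counts = Counter()
    for orgs in orgs_by_year.values():
        # set() makes sure a name listed twice in one year is counted once
        counts.update(set(orgs))
    return counts


# Hypothetical sample data, not real GSoC results.
sample = {
    2016: ["Org A", "Org B"],
    2017: ["Org A", "Org C", "Org C"],
    2018: ["Org A", "Org B"],
}
appearances = count_appearances(sample)
```

With this sample, `appearances["Org A"]` is 3 because the name shows up in all three years.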

Use Python 2.7.

Requirements :

  • BeautifulSoup
  • Requests

Instructions :

# Clone this repository
git clone https://github.com/rohithasrk/GSoC-Organisation-Scraper.git

# Go into the repository
cd GSoC-Organisation-Scraper

# Install dependencies
[sudo] pip2 install -r requirements.txt

# Run the app without passing a technology as a command line argument
python2 scrape.py

# Enter the technology of preference when prompted.
# Example: python

# Run the app with the technology passed as a command line argument
python2 scrape.py javascript

# To store the output in a text file, redirect it
python2 scrape.py ruby > ruby_orgs
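
The two ways of supplying the technology shown above (as an argument or at a prompt) could be handled roughly as follows. This is only a sketch of the pattern. The function name `get_technology` and the prompt text are assumptions, and the actual logic in `scrape.py` may differ.

```python
import sys

# Python 2.7 reads a line with raw_input; fall back to input on Python 3.
try:
    read_line = raw_input
except NameError:
    read_line = input


def get_technology(argv, prompt=None):
    """Return argv[1] if a technology was passed, otherwise ask the user."""
    if len(argv) > 1:
        return argv[1].strip()
    ask = prompt or read_line
    return ask("Enter the technology of preference: ").strip()


# Explicit argument lists are used here for illustration;
# a real script would pass sys.argv instead.
from_argument = get_technology(["scrape.py", "javascript"])
from_prompt = get_technology(["scrape.py"], prompt=lambda message: "python\n")
```

Taking the prompt function as a parameter keeps the helper easy to exercise without a terminal.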

Screenshots :

When searching for javascript and ruby, some of the results look like this:

Python orgs 1

Python orgs 2

TODOs :

  • Make the code run faster.
  • Remove duplicate results.
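
One possible approach to the second TODO item is to drop repeated entries while keeping the order in which they were first found. The sketch below assumes results are plain strings. `unique_in_order` is a hypothetical helper and is not part of the current code.

```python
def unique_in_order(results):
    """Return results with repeats removed, preserving first-seen order."""
    seen = set()
    unique = []
    for item in results:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique


# Hypothetical sample data, not real scraper output.
cleaned = unique_in_order(["Org A", "Org B", "Org A", "Org C", "Org B"])
```

Here `cleaned` is `["Org A", "Org B", "Org C"]`. The set lookup keeps the pass linear in the number of results.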

Contributing :

  • Fork the repo.
  • Create a new branch named <your_feature>
  • Commit changes and make a PR.
  • PRs are welcome.

This program uses PyTerm-Colors : https://github.com/vinamarora8/PyTerm-Colors.git