This script pulls down information from New York State Homes and Community Renewal's (HCR) list of community-based housing organizations. The list is currently stuck in a not-so-useful HTML table that links to each organization's details, such as contact info, website, and service area.
These orgs perform work related to affordable housing, community development and civil legal services in New York State and New York City.
This was originally coded as a scraper that runs on Morph but has since been altered. To get started with Morph see the documentation.
Requires Python with pip and Node.js with npm.
First, in terminal cd
to the folder containing this repo.
To grab Python dependencies do:
pip install -r python-requirements.txt
To grab Node.js dependencies do:
npm install
Run python scraper.py
to create the JSON file of the HCR organization list and pull down the organization details from HCR's server. This will take a while.
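For readers curious how links can be pulled out of an HTML table like the one described above, here is a minimal sketch using only the Python standard library. It is not the code in scraper.py; the class name TableLinkParser, the sample HTML, and the URL pattern are all made up for illustration.

```python
from html.parser import HTMLParser


class TableLinkParser(HTMLParser):
    """Collect (link text, href) pairs for every <a> found inside a <td> cell."""

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.current_href = None
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True
        elif tag == "a" and self.in_cell:
            # Anchors without an href are ignored (current_href stays None)
            self.current_href = dict(attrs).get("href")
            self.text_parts = []

    def handle_data(self, data):
        # Only keep text that sits inside a link we are tracking
        if self.current_href is not None:
            self.text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.current_href is not None:
            text = "".join(self.text_parts).strip()
            self.links.append((text, self.current_href))
            self.current_href = None
        elif tag == "td":
            self.in_cell = False


# Hypothetical stand-in for the HCR table; the real markup may differ
sample_html = """
<table>
  <tr><td><a href="/org?id=1">Example Housing Org</a></td><td>Albany</td></tr>
  <tr><td><a href="/org?id=2">Another Org</a></td><td>Bronx</td></tr>
</table>
"""

parser = TableLinkParser()
parser.feed(sample_html)
print(parser.links)
```

Each collected href would then be fetched in turn to get the organization details, which is why the real run takes a while.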
Then run node join_json.js > data_joined.json
to join the organization list JSON file with the organization details JSON file and write the result to data_joined.json.
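The join itself is done by join_json.js, but the idea can be sketched in Python as well. This is only an illustration: the function name join_orgs, the shared key "id", and the sample records are assumptions, not the actual field names in the HCR data.

```python
import json


def join_orgs(org_list, org_details, key="id"):
    """Merge each list entry with its detail record, matched on a shared key.

    The key name "id" is a placeholder; use whatever field both files share.
    """
    details_by_key = {d[key]: d for d in org_details if key in d}
    joined = []
    for org in org_list:
        merged = dict(org)
        # Detail fields fill in what the list entry lacks; list fields win on conflict
        for field, value in details_by_key.get(org.get(key), {}).items():
            merged.setdefault(field, value)
        joined.append(merged)
    return joined


# Tiny in-memory samples standing in for the two JSON files
org_list = [
    {"id": 1, "name": "Example Housing Org"},
    {"id": 2, "name": "Another Org"},
]
org_details = [
    {"id": 1, "website": "http://example.org"},
    {"id": 3, "website": "http://example.com"},
]

joined = join_orgs(org_list, org_details)
print(json.dumps(joined, indent=2))
```

This version keeps every organization from the list and simply leaves out detail fields when no matching detail record exists.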
I manually converted the data from JSON to CSV format using a web-based data converter.
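The conversion could also be scripted instead of done by hand. Below is a rough sketch using Python's built-in csv module. The function name records_to_csv is made up, and it assumes the joined JSON is a flat list of objects with simple string or number values (nested values would need flattening first).

```python
import csv
import io


def records_to_csv(records):
    """Turn a list of flat dicts into CSV text with a header row."""
    # Collect every field name in first-seen order, since records may differ
    fieldnames = []
    for rec in records:
        for name in rec:
            if name not in fieldnames:
                fieldnames.append(name)

    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)  # missing fields are written as empty cells
    return buf.getvalue()


# Hypothetical sample records
sample = [{"name": "A", "website": "w"}, {"name": "B"}]
csv_text = records_to_csv(sample)
print(csv_text)
```

To use it on the real data, load data_joined.json with json.load, pass the resulting list to records_to_csv, and write the returned text to a .csv file.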
Note: the organization details JSON file lists more organizations than the organization list JSON file. I believe this is because the HCR lists organizations elsewhere on its website; some of these may relate to housing and community development but, for whatever reason, do not appear in the original HTML table.
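One way to see which organizations are the extra ones is to compare the two files by a shared key. This is a small sketch under the same assumption as before: the key name "id", the function name unlisted_orgs, and the sample records are placeholders, not the real field names.

```python
def unlisted_orgs(org_list, org_details, key="id"):
    """Return detail records whose key does not appear in the organization list."""
    listed = {org[key] for org in org_list if key in org}
    return [d for d in org_details if d.get(key) not in listed]


# Hypothetical samples: organization 3 has details but is not in the list
sample_list = [{"id": 1, "name": "Example Housing Org"}]
sample_details = [
    {"id": 1, "website": "http://example.org"},
    {"id": 3, "website": "http://example.com"},
]

extra = unlisted_orgs(sample_list, sample_details)
print(extra)
```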