ML4

Scraping Data

To re-collect the raw data, first install the Scrapy framework (http://scrapy.org/). Then replace the items.py and spiders.py files with the ones provided in this repo. There are 4 working spiders (DuProp, DuPropSold, RoyalHouse, RoyalCondo). Run "scrapy crawl [SPIDER NAME]" to collect the data (use the output options to save to CSV).
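As a rough sketch, the four crawls could be scripted from Python rather than typed one by one. The helper names (`build_crawl_command`, `crawl_all`) and the one-CSV-per-spider output layout are assumptions for illustration and are not part of this repo; `-o` is Scrapy's feed-export option, and it appends if the output file already exists.

```python
import subprocess

# Spider names as listed above.
SPIDERS = ["DuProp", "DuPropSold", "RoyalHouse", "RoyalCondo"]


def build_crawl_command(spider, output_dir="."):
    """Return the scrapy command that crawls one spider and writes a CSV feed."""
    # The .csv extension tells Scrapy to use its CSV exporter.
    # The output path <output_dir>/<spider>.csv is an assumed naming scheme.
    return ["scrapy", "crawl", spider, "-o", f"{output_dir}/{spider}.csv"]


def crawl_all(output_dir="."):
    """Run every spider in turn; call this from inside the Scrapy project directory."""
    for spider in SPIDERS:
        # check=True raises CalledProcessError if a crawl exits with an error.
        subprocess.run(build_crawl_command(spider, output_dir), check=True)
```

Calling `crawl_all()` from the Scrapy project directory would then produce one CSV file per spider, assuming the provided items.py and spiders.py are already in place.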

Clean Data + Merge with Open Data

Once the data is collected, run the script "datacleanup.py" in the root directory (change lines 50 and 61 as appropriate). Place the output file in /data/data, then run "add_open_data.py" (change line 270 as appropriate). The final output dataset is the one to feed to the algorithms. All required open data files must exist in the same directory as that script.
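Because the merge step depends on the open data files being present, a quick check before running "add_open_data.py" may save a failed run. This is only a sketch: the file names in `REQUIRED_OPEN_DATA` are hypothetical placeholders and the helper `missing_files` is not part of this repo; substitute the files the script actually reads.

```python
from pathlib import Path

# Hypothetical file names: replace with the open data files add_open_data.py actually reads.
REQUIRED_OPEN_DATA = ["open_data_a.csv", "open_data_b.csv"]


def missing_files(directory, required=REQUIRED_OPEN_DATA):
    """Return the names in `required` that are not regular files in `directory`."""
    base = Path(directory)
    return [name for name in required if not (base / name).is_file()]
```

An empty list from `missing_files(".")`, run in the directory that holds the script, would mean every listed file is present.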

Experimentation

Use the "missing_data.py" script found in the root directory. Everything runs as part of the main method.

