Missing Geo Codes #2
Comments
|
Thanks for finding these. It took a while to get decent coordinates for a few dozen houses. Please double check that these were merged in correctly. I changed the PIDs in the geo data set but not in the raw data. Did you notice any erroneous values in the data when you worked with it on Kaggle? I see two properties that are always (wildly) mis-predicted in the demo models that I've run. I'll target sending this to CRAN at the end of the week, so let me know if you see anything wrong with the updates for these two issues before then. |
|
I will have a look at the merged data later today. Bruce Hoppe tipped me off about the data download link on the Ames Assessor's Office website, which makes it possible to download 20k+ records at once. The houses that I struggled with most were also the ones Prof. De Cock described in the notes:
I did a little bit of research on Kaggle regarding those five houses (three "partial sale" houses in Edwards and two upscale in Northridge):
If you decide to drop these five properties, I will feel no sorrow. Everything else is kind of precious and I would make an effort to keep it. |
|
Those are the ones. I'll keep them in since they are a good example of data investigation (hopefully students will be able to identify them).
That would have saved me some time! |
|
The join seems fine. I geo-coded the remainder of the data points. The following locations are interpolated from neighboring PIDs and verified on Google.
library(tibble)
tribble(~PID, ~newPID, ~Longitude, ~Latitude,
  "0902477120", NA, -93.605207, 42.023218, # empty lot, 505 E Lincoln Way, Ames, IA
  "0902477130", NA, -93.605421, 42.023222, # empty lot, 509 E Lincoln Way, Ames, IA
  "0912251110", NA, -93.588227, 42.018300, # empty lot, 412 Freel Drive, Ames, IA
  "0904101170", NA, -93.657031, 42.031281, # empty lot, 1015 N Hyland Ave
  "0909201110", NA, -93.647024, 42.019272  # now condos, 2309 Knapp St, Ames
) |
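For anyone checking a merge like this locally, here is a minimal base R sketch of filling missing coordinates from a patch table keyed on PID. It is not the package's actual merge code. The `geo` table, the `fill_coords` helper, and the "placeholder" row are hypothetical; the two patch rows are copied from the table above.

```r
# Patch rows copied from the table in this thread.
patch <- data.frame(
  PID       = c("0902477120", "0902477130"),
  Longitude = c(-93.605207, -93.605421),
  Latitude  = c(42.023218, 42.023222),
  stringsAsFactors = FALSE
)

# Hypothetical geo table: the first two rows lack coordinates; the third
# row (placeholder PID and coordinates) is already geo-coded.
geo <- data.frame(
  PID       = c("0902477120", "0902477130", "placeholder"),
  Longitude = c(NA, NA, -93.6),
  Latitude  = c(NA, NA, 42.0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: fill coordinates only where they are missing
# and the PID has an entry in the patch table.
fill_coords <- function(geo, patch) {
  idx  <- match(geo$PID, patch$PID)           # patch row for each geo row, or NA
  fill <- !is.na(idx) & is.na(geo$Longitude)  # never overwrite existing values
  geo$Longitude[fill] <- patch$Longitude[idx[fill]]
  geo$Latitude[fill]  <- patch$Latitude[idx[fill]]
  geo
}

filled <- fill_coords(geo, patch)
```

Restricting the fill to rows with missing coordinates keeps already geo-coded rows untouched, so `anyNA(filled$Longitude)` afterwards is a quick check on what is still missing.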
|
Pretty impressive! Just added them. |
|
I started using
|
I think geo location matching is an awesome feature, as it allows cross-referencing this dataset against census data and other cool things. I tried to find some more matches using the Property Search Form and found that in certain cases the authorities adjust the PID somewhat to accommodate a new version of the record (at least this is what I understood to be happening).
In a couple of instances I was able to match the observations in the dataset to the records using Lot Area, Total Living Area, Year Built and other variables. The Comparables Search Form was useful for that.
Here's a result of my efforts:
Longitude and Latitude are geo-coded using Google Maps. I hope you will find these useful and include them in the new version of the package.
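The attribute-based matching described above was done by hand through the assessor's search forms. A rough programmatic equivalent, assuming the assessor records were downloaded into a data frame, could look like the base R sketch below. The column names (`Lot_Area`, `Gr_Liv_Area`, `Year_Built`), the `match_by_attributes` helper, and all values are made up for illustration.

```r
# Hypothetical observations that have no PID match yet.
unmatched <- data.frame(
  row_id      = c(1L, 2L),
  Lot_Area    = c(9000L, 12000L),
  Gr_Liv_Area = c(1500L, 2100L),
  Year_Built  = c(1975L, 2006L)
)

# Hypothetical assessor records; PIDs "A", "B", "C" are placeholders.
records <- data.frame(
  PID         = c("A", "B", "C"),
  Lot_Area    = c(9000L, 12000L, 12000L),
  Gr_Liv_Area = c(1500L, 2100L, 2100L),
  Year_Built  = c(1975L, 2006L, 2006L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: join on the shared attributes and keep only
# observations that match exactly one record. Ambiguous matches are
# dropped so they can be checked by hand.
match_by_attributes <- function(unmatched, records, keys) {
  cand <- merge(unmatched, records, by = keys)
  n_per_obs  <- table(cand$row_id)
  unique_ids <- names(n_per_obs)[n_per_obs == 1]
  cand[as.character(cand$row_id) %in% unique_ids, , drop = FALSE]
}

matches <- match_by_attributes(
  unmatched, records,
  keys = c("Lot_Area", "Gr_Liv_Area", "Year_Built")
)
```

In this toy data, observation 1 matches only record "A", while observation 2 matches both "B" and "C" and is therefore left out of `matches`.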