Good sources of messy addresses #16

Closed
fgregg opened this issue Aug 8, 2014 · 7 comments

@fgregg
Member

fgregg commented Aug 8, 2014

@fgregg
Member Author

fgregg commented Sep 2, 2014

@waldoj, could you send us a sample of the addresses in the lobbying data you were working with?

@fgregg
Member Author

fgregg commented Sep 2, 2014

@jernsthausen, can you send us a sample of the addresses you were looking to parse?

@jernsthausen

@fgregg give me a moment to find a dataset with some completely unparsed addresses.

@waldoj

waldoj commented Sep 3, 2014

I have even more useful messy address data on hand, also from the state of Virginia. You can find it in most of the files at Virginia Businesses. I've taken all of the limited partnership addresses from their CSV file and posted it in a gist, which I hope you'll find helpful.

@fgregg
Member Author

fgregg commented Sep 3, 2014

Ooooh! Thanks!

On Wed, Sep 3, 2014 at 11:13 AM, Waldo Jaquith <notifications@github.com> wrote:

I have even more useful messy address data on hand, also from the state of Virginia. You can find it in most of the files at Virginia Businesses (http://business.openva.com/). I've taken all of the limited partnership addresses from their CSV file (http://business.openva.com/3_lp.csv) and posted it in a gist (https://gist.github.com/waldoj/be5dca32a99835a9066f), which I hope you'll find helpful.



773.888.2718
2231 N. Monticello Ave
Chicago, IL 60647

fgregg added the training label Sep 5, 2014

@fgregg
Member Author

fgregg commented Sep 6, 2014

@jernsthausen, @waldoj Thanks for the very messy addresses. I think the parser is at a point where you guys might want to start playing with it.

Follow the instructions in the readme, and give it a spin. I think you'll get better results if you feed the data in column by column (i.e., don't concatenate address_1 and address_2 into a single string; parse each column separately).
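
For anyone reading along, the column-by-column approach might look roughly like the sketch below. The `parse` argument is a placeholder for whatever parsing function the readme describes; `toy_parse` is only a stand-in so the snippet runs on its own, and the sample row is made up. Only the column names `address_1` and `address_2` come from this thread.

```python
import csv
import io


def parse_by_column(row, columns, parse):
    """Run `parse` on each non-empty address column separately.

    `parse` is an assumption: any callable that takes one address string.
    """
    results = {}
    for name in columns:
        value = (row.get(name) or "").strip()
        if value:  # skip blank columns rather than parsing empty strings
            results[name] = parse(value)
    return results


def toy_parse(text):
    # Hypothetical stand-in parser: just splits on whitespace.
    return text.split()


# Made-up sample data, inlined so the sketch is self-contained.
sample = io.StringIO(
    "name,address_1,address_2\n"
    "Example LP,123 Main St,Suite 4\n"
    "Other LP,PO Box 9,\n"
)

rows = [
    parse_by_column(record, ["address_1", "address_2"], toy_parse)
    for record in csv.DictReader(sample)
]
print(rows)
```

Each column gets its own call to the parser, so a second-line value like a suite number is never run together with the street line.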

@waldoj

waldoj commented Sep 6, 2014

Will do! Thank you, @fgregg!

fgregg closed this as completed Mar 19, 2015