whosonfirst-data/whosonfirst-brands

whosonfirst-brands

Brands in Who's On First documents.

Caveats

This is a work in progress and very much still "wet paint". There is little to no tooling for this stuff yet.

Where do all these #brands come from?

At the moment, they come from the Elasticsearch index running the Who's On First Spelunker. They are the product of a not very sophisticated faceting process on an unanalyzed copy of the wof:name field (called, unsurprisingly, name_not_analyzed). Like this:

curl -s -v --max-time 600 'http://localhost:9200/spelunker/_search?from=0&size=50' -d '{"query": {"term": {"wof:placetype": "venue"}}, "aggregations": {"brands": {"terms": {"field": "name_not_analyzed", "size": 0}}}, "size": 0}' > brands.json

That produces something like 16 million distinct names. We have not imported most of those. Instead we have limited the #brands included here to only those with 50 (or more) venues. So instead of 16 million #brands we have about 7,400 as of this writing. Maybe the cut-off point should be 25, maybe it should be 10. Maybe it should be 5. We don't know yet. We're figuring it out as we go.
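The cut-off described above can be sketched in a few lines of Python. This is a minimal illustration, not the tooling actually used to build this repository. It assumes brands.json has the usual Elasticsearch terms-aggregation shape, with buckets carrying `key` and `doc_count` under `aggregations.brands.buckets`. The function names `load_response` and `filter_brands` are hypothetical.

```python
import json


def load_response(path):
    """Read a saved Elasticsearch response (e.g. brands.json) from disk."""
    with open(path) as fh:
        return json.load(fh)


def filter_brands(response, min_venues=50):
    """Return (name, venue_count) pairs for names with at least min_venues venues.

    Assumes the standard terms-aggregation layout:
    response["aggregations"]["brands"]["buckets"] is a list of
    {"key": <name>, "doc_count": <number of venues>} dictionaries.
    """
    buckets = response["aggregations"]["brands"]["buckets"]
    return [
        (bucket["key"], bucket["doc_count"])
        for bucket in buckets
        if bucket["doc_count"] >= min_venues
    ]
```

Calling `filter_brands(load_response("brands.json"), 25)` or passing 10 or 5 instead is one way to see how many candidate #brands each alternative cut-off would let through.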

It is assumed that a whole bunch of these records will be superseded or deprecated or both. That work remains tomorrow's problem.
