Extract and Visualize location from any file

GeoParser

The GeoParser is a software tool that can process information from any type of file, extract geographic coordinates, and visualize locations on a map. Users who want a geographical representation of their data can search for locations through a search index or by uploading files from their computer. The GeoParser parses the files and plots cities or latitude-longitude points on the map. Once the points are plotted, users can filter the results by density, or by searching for a keyword and applying a "facet" to the parsed information. Clicking a location point on the map reveals more information about the location and how it relates to the search.

Installation (Docker)

NO NEED TO CLONE THE REPO

1.Install Docker from Docker.com

2.Download docker.zip, unzip it, and cd into the docker folder

3.Run "Docker Quickstart Terminal" on your machine

4.Run the docker-compose build command inside the "docker" folder from step 2

5.After the build finishes, run the docker-compose up command

6.Run docker-machine ip to get your Docker IP address and open http://<docker host ip>:8000 in your browser

Installation (manually)

Requirements

-Python 2.7

-pip

-Django

-Apache Tika

Instructions

Install the Python requirements

pip install -r requirements.txt

How to Run the Application

1.Run Solr

Change directory to where you cloned the project

cd Solr/solr-5.3.1/
./bin/solr start

2.Clone lucene-geo-gazetteer repo

git clone https://github.com/chrismattmann/lucene-geo-gazetteer.git
cd lucene-geo-gazetteer
mvn install assembly:assembly

Add lucene-geo-gazetteer/src/main/bin to your PATH environment variable, then make sure it is working:

lucene-geo-gazetteer --help
usage: lucene-geo-gazetteer
 -b,--build <gazetteer file>           The Path to the Geonames
                                       allCountries.txt
 -h,--help                             Print this message.
 -i,--index <directoryPath>            The path to the Lucene index
                                       directory to either create or read
 -s,--search <set of location names>   Location names to search the
                                       Gazetteer for
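
If the command is not found, the PATH change has usually not taken effect in the current shell. As a minimal sketch (written for Python 3 and run outside the application; the helper name find_executable is hypothetical), the lookup can be checked like this:

```python
import shutil


def find_executable(name="lucene-geo-gazetteer"):
    """Return the full path of `name` if it is discoverable on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_executable()
    if path is None:
        print("lucene-geo-gazetteer is not on PATH; re-check the step above")
    else:
        print("found at", path)
```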

3.Build a Gazetteer using the Geonames.org dataset (1.2 GB)

cd lucene-geo-gazetteer
curl -O http://download.geonames.org/export/dump/allCountries.zip
unzip allCountries.zip
lucene-geo-gazetteer -i geoIndex -b allCountries.txt

Make sure it is working:

lucene-geo-gazetteer -s Pasadena Texas
[
{"Texas" : [
"Texas",
"-91.92139",
"18.05333"
]},
{"Pasadena" : [
"Pasadena",
"-74.06446",
"4.6964"
]}
]
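
The search result is plain JSON, so it is easy to consume from a script. The sketch below is illustrative only: the function name is hypothetical, and it assumes each entry has the shape shown above (query name, matched name, then two numbers). The first number is read as longitude because -91.92139 lies outside the valid latitude range.

```python
import json

# Sample copied from the command output above.
SAMPLE_OUTPUT = """
[
{"Texas" : ["Texas", "-91.92139", "18.05333"]},
{"Pasadena" : ["Pasadena", "-74.06446", "4.6964"]}
]
"""


def parse_gazetteer_output(text):
    """Map each searched name to its best match and coordinates."""
    results = {}
    for entry in json.loads(text):
        # Each entry is a one-key object: {query: [name, lon, lat]}.
        # The lon/lat order is an assumption inferred from the sample values.
        for query, (name, first, second) in entry.items():
            results[query] = {
                "name": name,
                "longitude": float(first),
                "latitude": float(second),
            }
    return results


if __name__ == "__main__":
    for query, match in parse_gazetteer_output(SAMPLE_OUTPUT).items():
        print(query, match)
```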

Now start the lucene-geo-gazetteer server:

lucene-geo-gazetteer -server

4.Run the Tika server on port 8001, as described in https://wiki.apache.org/tika/GeoTopicParser. The port can be configured via config.txt

5.Make sure you are able to extract locations from the Tika server

curl -T /path/to/polar.geot -H "Content-Disposition: attachment; filename=polar.geot" http://localhost:8001/rmeta

You can obtain the file here

The output should look like this:

[
   {
      "Content-Type":"application/geotopic",
      "Geographic_LATITUDE":"39.76",
      "Geographic_LONGITUDE":"-98.5",
      "Geographic_NAME":"United States",
      "Optional_LATITUDE1":"27.33931",
      "Optional_LONGITUDE1":"-108.60288",
      "Optional_NAME1":"China",
      "X-Parsed-By":[
         "org.apache.tika.parser.DefaultParser",
         "org.apache.tika.parser.geo.topic.GeoParser"
      ],
      "X-TIKA:parse_time_millis":"1634",
      "resourceName":"polar.geot"
   }
]
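
To read the locations out of such a response programmatically, something like the following could be used. It is a sketch, not the application's own code: the function name is hypothetical, and it assumes the alternate locations are numbered contiguously from 1 (Optional_NAME1, Optional_NAME2, and so on), which the single sample above suggests but does not prove.

```python
import json

# Trimmed copy of the sample response shown above.
SAMPLE_RESPONSE = """
[
   {
      "Content-Type":"application/geotopic",
      "Geographic_LATITUDE":"39.76",
      "Geographic_LONGITUDE":"-98.5",
      "Geographic_NAME":"United States",
      "Optional_LATITUDE1":"27.33931",
      "Optional_LONGITUDE1":"-108.60288",
      "Optional_NAME1":"China",
      "resourceName":"polar.geot"
   }
]
"""


def extract_locations(rmeta_json):
    """Collect (name, latitude, longitude) tuples from a Tika /rmeta response."""
    locations = []
    for doc in json.loads(rmeta_json):
        # Primary location chosen by the GeoTopicParser.
        if "Geographic_NAME" in doc:
            locations.append((
                doc["Geographic_NAME"],
                float(doc["Geographic_LATITUDE"]),
                float(doc["Geographic_LONGITUDE"]),
            ))
        # Alternate locations; contiguous numbering from 1 is an assumption.
        n = 1
        while "Optional_NAME%d" % n in doc:
            locations.append((
                doc["Optional_NAME%d" % n],
                float(doc["Optional_LATITUDE%d" % n]),
                float(doc["Optional_LONGITUDE%d" % n]),
            ))
            n += 1
    return locations


if __name__ == "__main__":
    for name, lat, lon in extract_locations(SAMPLE_RESPONSE):
        print(name, lat, lon)
```

An empty list means Tika answered but found no locations, which usually points back to the gazetteer server from step 3.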

6.Run the Django server

python manage.py runserver

7.Open http://localhost:8000/ in your browser
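
If the page does not load, it can help to check which of the services is not listening. The following is a rough sketch written for Python 3 and run outside the application. Ports 8001 (Tika) and 8000 (Django) come from the steps above; 8983 is Solr's default port and is an assumption here, as are the function names.

```python
import socket

# 8001 and 8000 are taken from the steps above; 8983 is Solr's default
# port and is assumed not to have been changed.
SERVICES = {"Solr": 8983, "Tika server": 8001, "Django": 8000}


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host="localhost", services=SERVICES):
    """Map each service name to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in services.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print("%-12s %s" % (name, "up" if up else "NOT reachable"))
```

An open port only shows that something is listening; it does not confirm that the service behind it is configured correctly.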

Note: Please refer to the wiki page of this GitHub repository for a guide on how to use GeoParser.

Technologies we Use