
Basic Overview

Download articles and legal documents from public procurement sources:

  • Tender and contract data of European government bodies from OpenOpps, via its API or an Amazon S3 bucket (credentials are required).
  • Legislative texts via JRC-Acquis dataset.
  • Public procurement notices via TED dataset.

The harvested documents are then indexed into Solr, so you can run complex queries and visualize the results through Banana.

Quick Start

  1. Install Docker and Docker-Compose

  2. Clone this repo

    git clone https://github.com/TBFY/harvester.git
    
  3. Move into the src/test/docker directory.

  4. Start Solr and Banana with: docker-compose up -d

  5. Monitor the startup progress with: docker-compose logs -f

  6. A Solr Admin site should be available at: http://localhost:8983/solr

  7. Rename the configuration file src/test/resources/credentials.properties.sample to src/test/resources/credentials.properties (if you have credentials, update its contents)

  8. Download and extract TED articles from ftp://guest:guest@ted.europa.eu/daily-packages/ and save them in: input/ted

  9. Move back to the base directory and run the harvester with: ./test TEDHarvester

  10. A dashboard with results should be available at: http://localhost:8983/solr/banana

Take a look at all our harvesters here: src/test/java/harvest/.
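
Once a harvester has finished, one way to confirm that documents reached Solr is to query the standard /select handler directly. The following is a minimal sketch, not part of this project: the class name SolrCheck and the default core name "documents" are assumptions made for illustration, so replace the core name with the one shown in your Solr Admin site. It uses java.net.http.HttpClient, which requires Java 11 or later.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class SolrCheck {

    // Builds a URL for Solr's standard /select handler.
    // "core" is a hypothetical core name; use the one listed in the Solr Admin site.
    static URI selectUri(String baseUrl, String core, String query, int rows) {
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/" + core + "/select?q=" + q + "&rows=" + rows + "&wt=json");
    }

    public static void main(String[] args) throws Exception {
        // Assumed default core name; pass the real one as the first argument.
        String core = args.length > 0 ? args[0] : "documents";

        // q=*:* matches every document; rows=0 returns only the response header and hit count.
        URI uri = selectUri("http://localhost:8983/solr", core, "*:*", 0);

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

        // Print the HTTP status and the raw JSON body returned by Solr.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

If the core exists and the harvester indexed anything, the JSON body should report a non-zero numFound value in its response section.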

Latest Stable Release

Step 1. Add the JitPack repository to your build file

    <repositories>
        <repository>
            <id>jitpack.io</id>
            <url>https://jitpack.io</url>
        </repository>
    </repositories>

Step 2. Add the dependency

    <dependency>
        <groupId>com.github.TBFY</groupId>
        <artifactId>harvester</artifactId>
        <version>last-stable-release-version</version>
    </dependency>

Contributing

Please take a look at our contributing guidelines if you're interested in helping!