phyous/api-web-example

Java Webscraper API example

This project is an example of how to use modern Java libraries to quickly and easily create a web API for any existing third-party web page (similar to what sites like kimonolabs and import.io do).

Specifically, we're going to create a web service that scrapes techmeme.com and serves up the tech headlines for any given date as JSON.
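Since the service takes an optional date (the `date` query parameter shown in the setup steps further down), here is a minimal sketch of how such a parameter could be resolved, falling back to today when it is absent. The class name `DateParam` and the method `resolve` are hypothetical and are not taken from this project's source. The error handling is also an assumption.

```java
import java.time.Clock;
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical helper, not part of this repository.
public class DateParam {

    // Returns the requested date, or today's date (per the given clock)
    // when the parameter is missing or blank.
    static LocalDate resolve(String rawDate, Clock clock) {
        return Optional.ofNullable(rawDate)
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .map(DateParam::parseIso)
                .orElseGet(() -> LocalDate.now(clock));
    }

    // Parses an ISO-8601 date such as 2015-01-01.
    private static LocalDate parseIso(String s) {
        try {
            return LocalDate.parse(s);
        } catch (DateTimeParseException e) {
            // Assumption: malformed input is reported rather than silently ignored.
            throw new IllegalArgumentException("Expected a date like 2015-01-01, got: " + s, e);
        }
    }

    public static void main(String[] args) {
        Clock clock = Clock.systemDefaultZone();
        System.out.println(resolve("2015-01-01", clock)); // the requested date
        System.out.println(resolve(null, clock));         // today's date
    }
}
```

Passing a `Clock` instead of calling `LocalDate.now()` directly keeps the "today" branch easy to test with a fixed clock.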

Libraries used

  • Immutables --- Used to create our models. By using a handful of very powerful annotations along with code generation, we'll be able to create immutable objects and builders to represent our data models.
  • Jsoup --- Used for retrieving HTML and parsing it. This is an older library that has stood the test of time. Simply pass in CSS selectors to get the relevant HTML sections needed.
  • Pippo --- Used as our web framework. This is a relatively new web framework for Java that combines a very simple interface with a minimal footprint and a high degree of customizability. It reminds me of Dropwizard with a simpler interface.
  • Java 8 --- We'll be making use of Java 8 streams and optionals to process the incoming data.
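To illustrate the last item, here is a small sketch of how Java 8 streams and optionals might be used to clean up scraped items. The `Headline` class below is a hand-written, hypothetical stand-in (the project itself generates its models with Immutables), and `describe` is an illustrative name rather than a method from this repository.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class HeadlineStreams {

    // Hypothetical stand-in for a scraped item; the reporter may be missing.
    static final class Headline {
        final String title;
        final Optional<String> reporter;

        Headline(String title, String reporter) {
            this.title = title;
            // Treat null or blank bylines as absent.
            this.reporter = Optional.ofNullable(reporter)
                    .map(String::trim)
                    .filter(s -> !s.isEmpty());
        }
    }

    // Drops items without a usable title and prefixes the reporter when one is present.
    static List<String> describe(List<Headline> headlines) {
        return headlines.stream()
                .filter(h -> h.title != null && !h.title.trim().isEmpty())
                .map(h -> h.reporter.map(r -> r + ": ").orElse("") + h.title.trim())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Headline> scraped = Arrays.asList(
                new Headline("U.K. police allegedly arrest Lizard Squad hacker", "William Turton"),
                new Headline("Headline with no byline", null),
                new Headline("   ", "Nobody"));
        describe(scraped).forEach(System.out::println);
    }
}
```

Wrapping the optional field in `Optional` keeps the null check in one place, so the stream pipeline can stay a plain filter-and-map chain.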

Setup Instructions

  1. Be sure to have Java 8 and Maven installed
  2. Compile the source code: mvn package
  3. Run the server: java -jar target/apiweb-1.0-SNAPSHOT.jar
  4. Make a request to the server (I've set the server port to 8081 in application.properties).
    • Let's get the tech headlines from New Year's Day in 2015: http://localhost:8081/headlines?date=2015-01-01
    • Let's get the tech headlines for today: http://localhost:8081/headlines
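The two requests in step 4 can also be made from code. The following is a minimal client sketch using only the JDK's `HttpURLConnection`. It assumes the server is already running on port 8081 as described above. `HeadlinesClient`, `buildUrl`, and `fetch` are hypothetical names chosen for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.time.LocalDate;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical client, not part of this repository.
public class HeadlinesClient {

    static final String BASE = "http://localhost:8081/headlines";

    // With a date: .../headlines?date=2015-01-01; without: .../headlines (today).
    static String buildUrl(Optional<LocalDate> date) {
        // LocalDate.toString() yields the ISO form yyyy-MM-dd.
        return date.map(d -> BASE + "?date=" + d).orElse(BASE);
    }

    // Performs a GET request and returns the response body as a string.
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            return in.lines().collect(Collectors.joining("\n"));
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Requires the server from step 3 to be running.
        System.out.println(fetch(buildUrl(Optional.of(LocalDate.of(2015, 1, 1)))));
        System.out.println(fetch(buildUrl(Optional.empty())));
    }
}
```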

Sample output:

Request: http://localhost:8081/headlines?date=2015-01-01

[
  {
    "reporter": "Sarah Frier",
    "source": "Bloomberg",
    "title": "Snapchat raises $485.6M at $10B+ valuation from 23 investors",
    "summary": "  —  Snapchat Raises $485.6 Million to Close Out Big Fundraising Year  —  Snapchat Inc., among a pack of elite technology startups that has attained a valuation of $10 billion or more, capped the year with a filing that disclosed it raised $485.6 million.",
    "url": "http://www.bloomberg.com/news/2015-01-01/snapchat-raises-485-6-million-to-close-out-big-fundraising-year.html"
  },
  {
    "reporter": "William Turton",
    "source": "The Daily Dot",
    "title": "U.K. police allegedly arrest Lizard Squad hacker",
    "summary": "… Lizard Squad took credit for the Dec. 25 distributed denial-of-service (DDoS) attacks against the PlayStation Network and Xbox Live.  DDoS attacks overwhelm a network with too much traffic, leaving targeted networks inaccessible for legitimate users.",
    "url": "http://www.dailydot.com/crime/lizard-squad-vinnie-omari-arrested/"
  }
]

About

Sample code for turning a website into a json api using modern Java
