This is a community-supported Bobik SDK for web scraping in Java.
Include bobik-1.0.jar, located in the lib directory.
If you are scraping from an Android application, this is enough.
If you are using a vanilla Java environment, you might also need to include the HttpComponents and org.json packages (see http://usebobik.com/sdk).
Here's a quick example to get you started.

    // Needs java.util.Collection and org.json.JSONObject, plus the Bobik classes from the SDK jar.
    BobikClient bobik = new BobikClient("YOUR_AUTH_KEY");

    // Build the request: accumulate() collects repeated keys into a JSON array.
    JSONObject request = new JSONObject();
    for (String url : new String[]{"amazon.com", "google.com"})
        request.accumulate("urls", url);
    for (String query : new String[]{"//a/@href", "return $('.logo').length"})
        request.accumulate("queries", query);

    Job job = bobik.scrape(request, new JobListener() {
        public void onSuccess(JSONObject scraped_data) {
            System.out.println("Received data: " + scraped_data);
        }
        public void onProgress(float currentProgress) {
            System.out.println("Current progress is " + currentProgress * 100 + "%");
        }
        public void onErrors(Collection<String> errors) {
            // job is not referenced here: it is still unassigned while the listener is being created.
            for (String s : errors)
                System.err.println("Error: " + s);
        }
    });

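The quick example builds its request with accumulate, so it may help to see roughly what JSON that produces. The sketch below uses only the standard library and imitates the documented accumulate behaviour of org.json with a plain map. RequestShapeSketch, accumulate, and toJson are illustrative names, not part of the Bobik SDK, and real org.json does not guarantee key order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RequestShapeSketch {
    // Stand-in for org.json's accumulate(): values added under the same key are collected.
    static void accumulate(Map<String, List<String>> request, String key, String value) {
        request.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // Renders the map the way accumulate() leaves a JSONObject:
    // a lone value stays a plain string, two or more values become an array.
    static String toJson(Map<String, List<String>> request) {
        StringBuilder sb = new StringBuilder("{");
        boolean firstKey = true;
        for (Map.Entry<String, List<String>> e : request.entrySet()) {
            if (!firstKey) sb.append(',');
            firstKey = false;
            sb.append(quote(e.getKey())).append(':');
            List<String> values = e.getValue();
            if (values.size() == 1) {
                sb.append(quote(values.get(0)));
            } else {
                sb.append('[');
                for (int i = 0; i < values.size(); i++) {
                    if (i > 0) sb.append(',');
                    sb.append(quote(values.get(i)));
                }
                sb.append(']');
            }
        }
        return sb.append('}').toString();
    }

    // Minimal string escaping (backslash and double quote only); enough for this illustration.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        Map<String, List<String>> request = new LinkedHashMap<>();
        for (String url : new String[]{"amazon.com", "google.com"})
            accumulate(request, "urls", url);
        for (String query : new String[]{"//a/@href", "return $('.logo').length"})
            accumulate(request, "queries", query);
        // prints {"urls":["amazon.com","google.com"],"queries":["//a/@href","return $('.logo').length"]}
        System.out.println(toJson(request));

        Map<String, List<String>> single = new LinkedHashMap<>();
        accumulate(single, "urls", "amazon.com");
        // prints {"urls":"amazon.com"}
        System.out.println(toJson(single));
    }
}
```

Note the single-URL case: with accumulate, one value is stored as a plain string rather than a one-element array. This README does not say whether Bobik accepts a scalar there; if an array turns out to be required, putting a JSONArray under the key explicitly would sidestep the question.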
Full API reference is available at http://usebobik.com/sdk/java
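The listener in the quick example delivers results through callbacks. If those callbacks arrive on a background thread (an assumption; this README does not say), a command-line program may want to block until the job finishes. Below is a minimal sketch of that pattern using CountDownLatch. Listener and fakeScrape are hypothetical stand-ins so the example runs without the SDK; they mirror the three callbacks shown above but are not the real Bobik classes.

```java
import java.util.Collection;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class WaitForJobSketch {
    // Hypothetical stand-in for JobListener (the real one passes a JSONObject to onSuccess).
    interface Listener {
        void onSuccess(String scrapedData);
        void onProgress(float currentProgress);
        void onErrors(Collection<String> errors);
    }

    // Hypothetical stand-in for an asynchronous scrape call: reports from another thread.
    static void fakeScrape(Listener listener) {
        new Thread(() -> {
            listener.onProgress(0.5f);
            listener.onSuccess("{\"ok\":true}");
        }).start();
    }

    // Starts the job and blocks until a terminal callback fires or the timeout expires.
    static String scrapeAndWait(long timeoutSeconds) {
        CountDownLatch done = new CountDownLatch(1);
        AtomicReference<String> result = new AtomicReference<>();
        fakeScrape(new Listener() {
            public void onSuccess(String scrapedData) {
                result.set(scrapedData);
                done.countDown();
            }
            public void onProgress(float currentProgress) {
                System.out.println("Current progress is " + currentProgress * 100 + "%");
            }
            public void onErrors(Collection<String> errors) {
                // Assumption: errors end the job, so release the waiting thread here too.
                for (String s : errors)
                    System.err.println("Error: " + s);
                done.countDown();
            }
        });
        try {
            if (!done.await(timeoutSeconds, TimeUnit.SECONDS))
                throw new IllegalStateException("Timed out waiting for the job");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the job", e);
        }
        return result.get(); // null if the job ended with errors
    }

    public static void main(String[] args) {
        System.out.println("Received data: " + scrapeAndWait(5));
    }
}
```

With the real SDK, the same latch would be counted down inside the JobListener passed to bobik.scrape. On Android, blocking the main thread this way is usually a bad idea, so there the callbacks are better handled directly.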
- Write to support@usebobik.com to become a collaborator.
- The SDK source is fully contained within the bobik.jar directory.
- The latest compiled jar goes to lib.
- Javadoc goes to docs.
- A sample test application (admittedly, very primitive) is in sample_app.
Submit issues here on GitHub: https://github.com/emirkin/bobik_java_sdk/issues