
h4ck3rm1k3/scraperwiki_local

 
 


Local ScraperWiki Python Library

This library aims to be a drop-in replacement for the Python scraperwiki library for use locally. That is, functions will work the same way, and data will go into a local SQLite database; a targeted bombing of ScraperWiki's servers will not stop this local library from working, unless you happen to be running it on one of ScraperWiki's servers.

Installing

This will soon be in PyPI, but for now you can just install from the git repository.

Documentation

Read the standard ScraperWiki Python library's documentation, then look below for some quirks about the local version.

Quirks

The local library aims to be a drop-in replacement. In reality, the local version sometimes works better, though not all of the features have been implemented.

Differences

Datastore differences

The local scraperwiki.sqlite is powered by DumpTruck, so some things work a bit differently.

Data are stored in a local SQLite database named scraperwiki.db.
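Because the datastore is an ordinary SQLite file, it can be inspected with Python's standard sqlite3 module. The following is a minimal sketch, assuming the file sits in the current working directory; the helper name list_tables is made up for illustration and is not part of the library.

```python
import sqlite3


def list_tables(db_path="scraperwiki.db"):
    """Return the names of the tables in a SQLite file, sorted alphabetically.

    Note: sqlite3.connect creates an empty file if db_path does not exist.
    """
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    # Each row is a one-element tuple; unpack it to a plain string.
    return [name for (name,) in rows]
```

Calling list_tables() after a scraper run should show whichever tables the scraper saved to.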

Bizarre table and column names are supported.

Dates and datetimes are stored in a different standard format.

scraperwiki.sqlite.execute returns lists of dictionaries.
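To illustrate what a "list of dictionaries" result looks like, here is a sketch built on the standard sqlite3 module rather than on scraperwiki.sqlite itself. The helper query_as_dicts and the example table are hypothetical, and the exact structure returned by the library may differ in detail.

```python
import sqlite3


def query_as_dicts(conn, sql, params=()):
    """Run a statement and return every result row as a dict keyed by column name."""
    cursor = conn.execute(sql, params)
    # cursor.description is None for statements that return no rows.
    columns = [col[0] for col in cursor.description or []]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example (name TEXT, n INTEGER)")
conn.execute("INSERT INTO example VALUES (?, ?)", ("a", 1))
rows = query_as_dicts(conn, "SELECT name, n FROM example")
# rows is [{'name': 'a', 'n': 1}]
```

Each row can then be read by column name, for example rows[0]["name"].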

scraperwiki.sqlite.attach downloads the whole datastore from ScraperWiki, so you might not want to use it too often on large databases.

scraperwiki.sqlite.get_var and scraperwiki.sqlite.save_var store their data in the table _dumptruckvars, and they use a slightly different format.

Other Differences

Status of implementation

In general, features that have not been implemented raise a NotImplementedError.
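A scraper that must run both on ScraperWiki and locally can guard calls to features that may be missing. This is a sketch only: call_if_implemented and unimplemented_feature are hypothetical names, and the stand-in function merely imitates a library function that raises NotImplementedError.

```python
def call_if_implemented(func, *args, default=None, **kwargs):
    """Call func and return its result, or default if it raises NotImplementedError."""
    try:
        return func(*args, **kwargs)
    except NotImplementedError:
        return default


def unimplemented_feature(value):
    # Hypothetical stand-in for a library function that is not available locally.
    raise NotImplementedError("not available in the local library")


result = call_if_implemented(unimplemented_feature, "some input", default="skipped")
# result is 'skipped'
```

Catching only NotImplementedError keeps genuine bugs visible, since any other exception still propagates.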

Datastore

scraperwiki.sqlite is missing the following features.

  • Data argument to scraperwiki.sqlite.select
  • All of the verbose keyword arguments (these control what is printed in the ScraperWiki code editor)
  • scraperwiki.sqlite.show_tables for attached databases (it only works for the main database)

Geo

The UK geocoding helpers have not been implemented.

  • scraperwiki.geo

Utils

scraperwiki.utils is implemented, as well as the following functions.

  • scraperwiki.log
  • scraperwiki.scrape
  • scraperwiki.pdftoxml
  • scraperwiki.swimport

Deprecated

These submodules are deprecated and thus will not be implemented.

  • scraperwiki.apiwrapper
  • scraperwiki.datastore
  • scraperwiki.jsqlite
  • scraperwiki.metadata
  • scraperwiki.newsql

About

Offline version of scraperlibs
