Initial commit
achavez committed Feb 7, 2019
0 parents commit 11340c7
Showing 9 changed files with 667 additions and 0 deletions.
154 changes: 154 additions & 0 deletions .gitignore
@@ -0,0 +1,154 @@

# Created by https://www.gitignore.io/api/python,osx
# Edit at https://www.gitignore.io/?templates=python,osx

### OSX ###
# General
.DS_Store
.AppleDouble
.LSOverride

# Icon must end with two \r
Icon

# Thumbnails
._*

# Files that might appear in the root of a volume
.DocumentRevisions-V100
.fseventsd
.Spotlight-V100
.TemporaryItems
.Trashes
.VolumeIcon.icns
.com.apple.timemachine.donotpresent

# Directories potentially created on remote AFP share
.AppleDB
.AppleDesktop
Network Trash Folder
Temporary Items
.apdisk

### Python ###
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

### Python Patch ###
.venv/

# End of https://www.gitignore.io/api/python,osx
11 changes: 11 additions & 0 deletions Pipfile
@@ -0,0 +1,11 @@
[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[dev-packages]

[packages]

[requires]
python_version = "3.7"
94 changes: 94 additions & 0 deletions README.md
@@ -0,0 +1,94 @@
# 🏛️ socrata2sql

Plenty of state and local governments use Socrata to run their open data portals. This tool allows you to grab a dataset from one of these portals and copy it into a SQL database of your choice. It uses the Socrata API to understand the columns in the dataset and attempts to create correctly-typed columns in the SQL database to match, including PostGIS geometries if the database and source dataset support them.
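As a rough illustration of the type-matching idea described above, here is a minimal sketch using only the Python standard library. It is not the tool's actual implementation: the `TYPE_MAP` contents, the `create_table_sql` helper, and the sample column list are all assumptions made for this example, and the real loader builds its tables through SQLAlchemy rather than hand-written SQL.

```python
import sqlite3

# Hypothetical mapping from Socrata column type names to SQLite column types.
# The real tool's mapping is not shown in this commit; this is an assumption.
TYPE_MAP = {
    'text': 'TEXT',
    'number': 'NUMERIC',
    'checkbox': 'BOOLEAN',
    'calendar_date': 'TIMESTAMP',
}


def create_table_sql(table_name, columns):
    """Build a CREATE TABLE statement from (field_name, socrata_type) pairs.

    Unknown Socrata types fall back to TEXT so no column is dropped.
    """
    column_defs = ', '.join(
        '"{}" {}'.format(name, TYPE_MAP.get(socrata_type, 'TEXT'))
        for name, socrata_type in columns
    )
    return 'CREATE TABLE "{}" ({})'.format(table_name, column_defs)


# Made-up column metadata standing in for what the Socrata API would report.
columns = [('vendor', 'text'), ('amount', 'number'), ('paid', 'calendar_date')]
statement = create_table_sql('check_register', columns)

# Create the table in a throwaway in-memory database.
connection = sqlite3.connect(':memory:')
connection.execute(statement)
```

Falling back to `TEXT` for unrecognized types is one possible design choice; it keeps the load from failing on column types the mapping does not yet cover.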

## Requirements

- Python 3.x

## Installation

#### Using `pipenv`

1. Add our index to your project's Pipfile:
```ini
[[source]]
name = "dmn"
url = "http://dmn-pypi.s3-website-us-east-1.amazonaws.com/"
verify_ssl = false
```
2. Install directly from our private package index:
```sh
$ pipenv install tec -i dmn
```

#### Using `pip`

1. Install directly from our private package index:
```sh
$ pip install --find-links http://dmn-pypi.s3-website-us-east-1.amazonaws.com/tec/ --trusted-host dmn-pypi.s3-website-us-east-1.amazonaws.com tec
```

#### Locally (for development)

Using this option you'll have the `socrata2sql` command line tool _and_ you'll be able to alter the tool's code.

1. Clone the repository to your machine and step into the directory.

2. Install (preferably in a virtual environment) using the included [setup.py](setup.py):
```sh
$ pipenv install -e .
```

Or using `pip`:

```sh
$ pip install -e .
```

## Usage

```
Socrata to SQL database loader

Load a dataset from a Socrata-powered open data portal into a SQL database.
Uses the Socrata API to inspect the dataset, then sets up a table with matching
SQL types and loads all rows. The loader supports any database supported by
SQLAlchemy.

Usage:
  socrata2sql insert <site> <dataset_id> [-d=<database_url>] [-a=<app_token>] [-t=<table_name>]
  socrata2sql ls <site> [-a=<app_token>]
  socrata2sql (-h | --help)
  socrata2sql (-v | --version)

Options:
  <site>             The domain for the open data site. Ex: www.dallasopendata.com
  <dataset_id>       The ID of the dataset on the open data site. This is usually
                     a few characters, separated by a hyphen, at the end of the
                     URL. Ex: 64pp-jeba
  -d=<database_url>  Database connection string for destination database as
                     dialect+driver://username:password@host:port/database.
                     Default: sqlite:///<dataset name>.sqlite
  -t=<table_name>    Destination table in the database. Defaults to a sanitized
                     version of the dataset's name on Socrata.
  -a=<app_token>     App token for the site. Only necessary for high-volume
                     requests. Default: None
  -h --help          Show this screen.
  -v --version       Show version.

Examples:
  List all datasets on the Dallas open data portal:
  $ socrata2sql ls www.dallasopendata.com

  Load the Dallas check register into a local SQLite file (file name chosen
  from the dataset name):
  $ socrata2sql insert www.dallasopendata.com 64pp-jeba

  Load it into a PostgreSQL database called mydb:
  $ socrata2sql insert www.dallasopendata.com 64pp-jeba -d=postgresql:///mydb
```
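
The defaults described in the options above (a SQLite file named after the dataset, and a table name that is a sanitized version of the dataset's name) could be derived roughly as follows. This is a sketch under stated assumptions, not the tool's code: the helper names, the exact sanitizing rule, and the sample dataset name are hypothetical, and the ID pattern assumes Socrata's usual "four characters, hyphen, four characters" identifiers.

```python
import re

# Assumed shape of a Socrata dataset ID, e.g. 64pp-jeba.
DATASET_ID_PATTERN = re.compile(r'^[a-z0-9]{4}-[a-z0-9]{4}$')


def is_dataset_id(value):
    """Return True if value looks like a Socrata dataset ID."""
    return DATASET_ID_PATTERN.match(value) is not None


def sanitize_table_name(dataset_name):
    """Lowercase the name and collapse runs of other characters into '_'."""
    collapsed = re.sub(r'[^a-z0-9]+', '_', dataset_name.lower())
    return collapsed.strip('_')


def default_database_url(dataset_name):
    """Build the documented default: sqlite:///<dataset name>.sqlite"""
    return 'sqlite:///{}.sqlite'.format(dataset_name)


# Hypothetical dataset name, used only for illustration.
table_name = sanitize_table_name('Dallas Check Register (2019)')
database_url = default_database_url('Dallas Check Register (2019)')
```

With these helpers, `table_name` would be `dallas_check_register_2019`, which is safe to use as an unquoted identifier in most SQL dialects.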

## Copyright

&copy; 2019 The Dallas Morning News
29 changes: 29 additions & 0 deletions setup.py
@@ -0,0 +1,29 @@
from setuptools import setup

from socrata2sql import __version__


setup(
    name='socrata2sql',
    # __version__ is a tuple such as (0, 1, 0); setuptools expects a string
    version='.'.join(str(part) for part in __version__),
    description='SQL loader for Socrata data sets',
    url='http://github.com/DallasMorningNews/socrata2sql',
    author='Andrew Chavez / The Dallas Morning News',
    author_email='newsapps@dallasnews.com',
    license='MIT',
    packages=['socrata2sql'],
    install_requires=[
        'docopt~=0.6',
        'progress~=1.4',
        'sodapy~=1.5',
        'SQLalchemy~=1.2',
        'tabulate~=0.8',
        'geoalchemy2~=0.5',
    ],
    entry_points={
        'console_scripts': [
            'socrata2sql=socrata2sql.cli:main',
        ],
    },
    python_requires=">=3",
)
1 change: 1 addition & 0 deletions socrata2sql/__init__.py
@@ -0,0 +1 @@
__version__ = (0, 1, 0)
