This repository has been archived by the owner on Mar 1, 2018. It is now read-only.

Add Dockerfile
cuducos committed Dec 15, 2016
1 parent 2a153c6 commit 9c23eaf
Showing 3 changed files with 40 additions and 7 deletions.
20 changes: 20 additions & 0 deletions Dockerfile
@@ -0,0 +1,20 @@
FROM jupyter/datascience-notebook:latest
USER root
RUN apt-get update && apt-get install -y \
build-essential \
libxml2-dev \
libxslt1-dev \
python3-dev \
unzip \
zlib1g-dev

USER jovyan
RUN pip install --upgrade pip
COPY requirements.txt ./
COPY setup ./
RUN ./setup

COPY rosie.py ./
COPY rosie ./rosie
VOLUME /tmp/serenata-data
CMD python rosie.py run
25 changes: 19 additions & 6 deletions README.md
@@ -4,7 +4,7 @@
[![Code Climate](https://codeclimate.com/github/datasciencebr/rosie/badges/gpa.svg)](https://codeclimate.com/github/datasciencebr/rosie)
[![Coverage Status](https://coveralls.io/repos/github/datasciencebr/rosie/badge.svg?branch=master)](https://coveralls.io/github/datasciencebr/rosie?branch=master)

A Python application reading receipts from the Quota for Exercising Parliamentary Activity (aka CEAP, from the Brazilian Chamber of Deputies) and outputs, for each of the receipts, a "probability of corruption" and a list of reasons why is considered this way.
A Python application that reads receipts from the [Quota for Exercising Parliamentary Activity](https://github.com/datasciencebr/serenata-de-amor/blob/master/CONTRIBUTING.md#more-about-the-quota-for-exercising-parliamentary-activity-ceap) (aka CEAP) of the Brazilian Chamber of Deputies and outputs, for each receipt, a _probability of corruption_ and a list of reasons why it was considered this way.

- [x] Fetch CEAP dataset from Chamber of Deputies
- [x] Convert XML to CSV
@@ -16,17 +16,30 @@ A Python application reading receipts from the Quota for Exercising Parliamentar
- [ ] Machine Learning models using scikit-learn
- [ ] Task to push to Jarbas via API

## Setup
## Running

### With Docker

```console
$ docker build -t rosie .
$ docker run -v /tmp/serenata-data:/tmp/serenata-data rosie

```

Then check the `/tmp/serenata-data/` directory on your host machine for `irregularities.xz`.
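
To confirm that the run produced output, the file can be previewed from Python. This is a minimal sketch that assumes `irregularities.xz` is an xz-compressed text file (for example a CSV); the helper name `preview_xz` is hypothetical and not part of Rosie.

```python
import lzma
import os
from itertools import islice


def preview_xz(path, lines=5):
    """Return the first `lines` lines of an xz-compressed text file.

    Assumption: the file holds UTF-8 text (e.g. CSV rows).
    """
    with lzma.open(path, mode="rt", encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in islice(handle, lines)]


if __name__ == "__main__":
    # Output location mentioned in the README
    output = "/tmp/serenata-data/irregularities.xz"
    if os.path.exists(output):
        for row in preview_xz(output):
            print(row)
    else:
        print(output + " not found; run Rosie first")
```

Reading lazily with `islice` avoids decompressing the whole file just to look at its header.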

### Without Docker

#### Setup

```console
$ cd rosie
$ conda update conda
$ conda create --name serenata_rosie python=3
$ source activate serenata_rosie
$ ./setup
```

## Running
#### Running

```console
$ python rosie.py run
@@ -40,8 +53,8 @@ Also a target directory (where files are saved) can be passed, for example:
$ python rosie.py run /my/serenata/directory/
```

## Test suite
#### Test suite

```console
$ python rosie.py test
```
```
2 changes: 1 addition & 1 deletion setup
@@ -2,4 +2,4 @@

import pip

pip.main(['install', '--upgrade', '-r', 'requirements.txt', '--src', 'rosie'])
pip.main(['install', '--upgrade', '-r', 'requirements.txt'])
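
Note that `pip.main` was moved out of pip's public API in pip 10, so this script only works with older pip releases. A possible alternative, sketched here as an assumption rather than as the project's actual approach, is to invoke pip as a subprocess of the current interpreter; the helper name `pip_install_command` is hypothetical.

```python
import os
import subprocess
import sys


def pip_install_command(requirements="requirements.txt"):
    """Build a command line equivalent to the `pip.main([...])` call above."""
    return [sys.executable, "-m", "pip", "install", "--upgrade", "-r", requirements]


if __name__ == "__main__":
    if os.path.exists("requirements.txt"):
        # check_call raises CalledProcessError if pip exits with a non-zero status
        subprocess.check_call(pip_install_command())
    else:
        print("requirements.txt not found in the current directory")
```

Using `sys.executable -m pip` keeps the installation tied to the interpreter running the script, which matters inside a conda environment or a container.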
