# Example Notebook

This notebook helps collaborators get up and running with the project repository.

---

#### Quickstart

1. Clone the repository to the desired location on your server or workstation: `git clone xxx:xxx`
2. Use `make` to build our Python virtual environment and create your user config: `make 00-environment`. Follow the prompts to enter your information.
3. The data is already uploaded to our database, but to recreate the import and filtering you can run: `make 01-data`

---

#### Virtual Environment

For consistency, we use a standard Python virtual environment. It is isolated from the server-wide Python installation and comes loaded with several handy packages. The complete package list is tracked by Git and contained in `environment.yml`.

In [1]:
## Libraries

# Data processing
import numpy as np
import pandas as pd

# Natural Language Processing
import nltk
import textblob

# Visualization
import matplotlib as mpl
import seaborn as sns

# PostgreSQL Database
import psycopg2 as pg
import sqlalchemy as sa

# Helper libraries
# Easy command-line applications
import click
# Environment variables
import dotenv

In [2]:
## Versions
print('NumPy:\t\t{}'.format(np.__version__))
print('Pandas:\t\t{}'.format(pd.__version__))

# Note: nltk.version_info is the Python interpreter's sys.version_info;
# nltk.__version__ holds the NLTK release itself.
print('NLTK:\t\t{}'.format(nltk.version_info))
print('TextBlob:\t{}'.format(textblob.__version__))

print('Matplotlib:\t{}'.format(mpl.__version__))
print('Seaborn:\t{}'.format(sns.__version__))

print('Psycopg2:\t{}'.format(pg.__version__))
print('SQLAlchemy:\t{}'.format(sa.__version__))

print('Click:\t\t{}'.format(click.__version__))
print('Dotenv:\t\t{}'.format('???'))

NumPy:		1.14.2
Pandas:		0.22.0
NLTK:		sys.version_info(major=3, minor=6, micro=5, releaselevel='final', serial=0)
TextBlob:	0.15.1
Matplotlib:	2.2.2
Seaborn:	0.8.1
Psycopg2:	2.7.4 (dt dec pq3 ext lo64)
SQLAlchemy:	1.2.6
Click:		6.7
Dotenv:		???
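
The `???` printed for Dotenv is a hard-coded placeholder rather than a real version. One possible workaround, offered as a sketch and not as part of the project, is to read the version from the installed package metadata instead of from the module. The helper name `installed_version` is hypothetical, and the distribution name `python-dotenv` is assumed from the library's repository name. `importlib.metadata` requires Python 3.8 or later, so the sketch falls back to `pkg_resources` on older interpreters.

```python
def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper; not part of the project repository.
    """
    try:
        # Python 3.8+: package metadata from the standard library
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        version = None

    if version is not None:
        try:
            return version(distribution)
        except PackageNotFoundError:
            return None

    # Older interpreters: fall back to setuptools' pkg_resources
    import pkg_resources
    try:
        return pkg_resources.get_distribution(distribution).version
    except pkg_resources.DistributionNotFound:
        return None


# 'python-dotenv' is the assumed distribution name for the dotenv module
print('Dotenv:\t\t{}'.format(installed_version('python-dotenv') or '???'))
```

The same helper works for any of the packages listed above, which avoids relying on each module exposing its own `__version__` attribute.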


---

#### User Environments

In addition to the standard virtual environment, each user has a `.env` file that stores:
- database name
- username
- password
- database host
- database port
- data directory
- NLTK data directory

This `.env` file is required to interact with the database from Python. We use it together with the Python library [python-dotenv](https://github.com/theskumar/python-dotenv) for easy loading.

*Example:*

> USER=hawkID
>
> DATABASE_URL=...

In [3]:
# Example usage with .env
import os
from dotenv import find_dotenv, load_dotenv

# Find and load .env in one line
load_dotenv(find_dotenv())

# Access the saved variables
var = os.environ.get('VAR')
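
To illustrate how the loaded values might be used, here is a minimal sketch that assembles a PostgreSQL connection URL from separate settings. The variable names `DB_USER`, `DB_PASSWORD`, `DB_HOST`, `DB_PORT`, and `DB_NAME` and the helper `build_database_url` are assumptions made for this example; the actual keys in your `.env` may differ (the example above suggests a ready-made `DATABASE_URL` may already exist).

```python
from urllib.parse import quote_plus


def build_database_url(env):
    """Assemble a PostgreSQL URL from a mapping of settings.

    The key names are hypothetical; adapt them to the keys in your .env.
    """
    return 'postgresql://{user}:{password}@{host}:{port}/{name}'.format(
        # Percent-encode credentials so characters like '@' or '/' stay valid
        user=quote_plus(env['DB_USER']),
        password=quote_plus(env['DB_PASSWORD']),
        host=env['DB_HOST'],
        port=env['DB_PORT'],
        name=env['DB_NAME'],
    )


# Illustrative values only; in practice pass os.environ after load_dotenv()
example_settings = {
    'DB_USER': 'hawkID',
    'DB_PASSWORD': 'p@ss',
    'DB_HOST': 'localhost',
    'DB_PORT': '5432',
    'DB_NAME': 'tweets',
}
print(build_database_url(example_settings))
```

After `load_dotenv(find_dotenv())`, passing `os.environ` to the helper would yield a string that can be handed to `sqlalchemy.create_engine`.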

---

#### Makefile

The holy source of truth.

| Command | Description |
| --- | --- |
| `00-environment` | Creates the user environment from the ground up. Installs Conda if it is not found in the user's `$PATH`, then creates a new virtual environment. Finally, creates a `.env` according to user input. Should only be run once after cloning the repository. |
| `01-data` | Our data ingestion pipeline. Creates raw and filtered tables for our tweets, then imports the tweets into our database before applying filters and inserting the results into a new table. This should not be run more than once, as we have too much data for it to finish in any reasonable time. |
| `clean` | Removes Python temporary files (`.py[co]`, `__pycache__`) and Jupyter notebook `.ipynb_checkpoints/` directories. |
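
For readers who want to see what the `clean` target amounts to, the following is a rough Python equivalent. It is a sketch based only on the description in the table, not the Makefile recipe itself, and the function name `clean` and its `root` parameter are hypothetical.

```python
import shutil
from pathlib import Path


def clean(root):
    """Delete Python bytecode and notebook checkpoints below `root`.

    Returns the list of paths that were removed.
    """
    root = Path(root)
    removed = []

    # Remove cache directories first, so the files inside them are
    # already gone when the bytecode pass runs.
    for name in ('__pycache__', '.ipynb_checkpoints'):
        for directory in list(root.rglob(name)):
            if directory.is_dir():
                shutil.rmtree(directory)
                removed.append(directory)

    # Remove stray compiled files (*.pyc and *.pyo) left outside the caches
    for pattern in ('*.pyc', '*.pyo'):
        for path in list(root.rglob(pattern)):
            if path.is_file():
                path.unlink()
                removed.append(path)

    return removed
```

Calling `clean('..')` from the notebook directory would sweep the whole repository; source files such as `.py` and `.ipynb` are left untouched.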

In [6]:
!more ..\Makefile

#################################################################################
# GLOBALS                                                                       #
#################################################################################

# Project
PROFILE = default
PROJECT_NAME = immigration-sentiment
PROJECT_DIR := $(shell dirname $(realpath $(lastword $(MAKEFILE_LIST))))

# Python
PYTHON_INTERPRETER = python3
CONDA_HOME = $(HOME)/minicond3
CONDA_BIN_DIR = $(CONDA_HOME)/bin
CONDA = $(CONDA_BIN_DIR)/conda
CONDA_INSTALLER = miniconda_linux-x86_64.sh

# Environment
ENV_DIR = $(CONDA_HOME)/envs/$(PROJECT_NAME)
ENV_BIN_DIR = $(ENV_DIR)/bin
ENV_LIB_DIR = $(ENV_DIR)/lib
ENV_PYTHON = $(ENV_BIN_DIR)/python

#################################################################################
# COMMANDS                                                                      #
#################################################################################

## Sets up our project for a new us