Skip to content

Latest commit

 

History

History
366 lines (235 loc) · 10.1 KB

DEPLOYMENT.adoc

File metadata and controls

366 lines (235 loc) · 10.1 KB

Deployment

These instructions walk through installation of dependencies on a standard Ubuntu server. Installation for other Linux flavors will be similar.

Basic server set-up and log-in:

  • Ensure that http, https and ssh ports are available.

  • Store your private key (e.g. mykey.pem) and ensure it is readable (chmod 400 mykey.pem)

  • Log in to the server from a terminal with ssh (e.g. ssh -F /dev/null -i ./mykey.pem ubuntu@ec…​.compute-1.amazonaws.com

Update and upgrade server packages

sudo apt-get update sudo apt-get upgrade

Mount additional EBS volume on /data

Create flatgov directory

$ sudo mkdir /opt/flatgov
$ sudo chown ubuntu:ubuntu /opt/flatgov

Install Python, pyenv and python dependencies

  • Install pyenv dependencies

$ sudo add-apt-repository "deb http://archive.ubuntu.com/ubuntu $(lsb_release -sc) main universe"
$ sudo apt-get update
$ sudo apt-get install -y make build-essential libssl-dev zlib1g-dev \
 libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev\
 libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python-openssl\ libpcre3 libpcre3-dev
 $ sudo apt-get upgrade git
  • Install pyenv:

$ git clone https://github.com/pyenv/pyenv.git ~/.pyenv

Then:

echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
echo -e 'if command -v pyenv 1>/dev/null 2>&1; then\n eval "$(pyenv init -)"\nfi' >> ~/.bashrc

and source ~/.bashrc

Load .bashrc from .bash_profile for interactive login

echo '# The following sources ~/.bashrc in the interactive login case,' >> ~/.bashrc
echo '# because .bashrc is not sourced for interactive login shells:' >> ~/.bashrc
echo 'case "$-" in *i*) if [ -r ~/.bashrc ]; then . ~/.bashrc; fi;; esac' >> ~/.bashrc
  • Verify pyenv installation

pyenv install --list

  • Install pyenv virtualenv

$ git clone https://github.com/pyenv/pyenv-virtualenv.git $(pyenv root)/plugins/pyenv-virtualenv
  • Auto-activate virtualenvs

echo 'eval "$(pyenv virtualenv-init -)"' >> ~/.bash_profile

  • Restart shell to install pyenv virtualenv

$ exec "$SHELL"

  • Install Python 3.8.3

$ pyenv install 3.8.3

  • Create the flatgov virtualenv

pyenv virtualenv 3.8.3 flatgov

  • Upgrade pip

pip install --upgrade pip

Clone Flatgov repository

$ cd /opt/flatgov
$ git clone https://github.com/aih/FlatGov.git

Copy environment file. Edit the value of SECRET_Key and add a value for the PROPUBLICA_CONGRESS_API_KEY

:

$ cp /opt/flatgov/FlatGov/server_py/flatgov/.env-sample /opt/flatgov/FlatGov/server_py/flatgov/.env

Collect static documents

$ pyenv activate flatgov
$ cd /opt/flatgov/FlatGov/server_py/flatgov/
$ python manage.py collectstatic

Copy data

The data generated by this repository’s scripts resides in a congress directory. This is compressed with tar -czf congress.tar.gz congress and stored in an S3 bucket in AWS.

To download and unzip this data:

These provide access to the s3://flatgov bucket.

$ aws configure # See documentation at the aws link above

  • Copy the data from AWS

    • Copy congress directory with processed data to /opt/flatgov/FlatGov/server_py/congress

$ nohup aws cp s3://flatgov/congress.tar.gz /opt/flatgov/congress.tar.gz &

Note
This is copying ~4GB of data. It may take an hour or more (which is why it should be done in the background).
  • Uncompress the congress directory

$ cd /opt/flatgov $ nohup tar xvzf congress.tar.gz &

Note
The uncompressing may also take some time.
  • Symlink the congress directory so it can be accessed in the Django app

$ ln -s /opt/flatgov/congress /opt/flatgov/FlatGov/flatgov/server_py/congress

Install flatgov dependencies

Ensure that the flatgov virtual environment is activated: pyenv activate flatgov

From the top level of this repository:

$ cd /opt/flatgov/FlatGov
$ pip install -e ./
# These requirements should be added to setup.py
# $ pip install -r requirements.txt
# $ pip install -r server_py/requirements.txt
Note
to install psycopg2 on ubuntu, I had to sudo apt-get install --reinstall libpq-dev. See https://stackoverflow.com/questions/58961043/how-to-install-libpq-fe-h#

Install Postgresql

sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
sudo apt-get update
sudo apt-get -y install postgresql

Install db, run migrations and add data

Follow the instructions in django_database.adoc

Install Elasticsearch (for similarity data)

  • Download and install public signing key

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

  • Install Apt https transport

sudo apt-get install apt-transport-https

  • Save repository definition

echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-7.x.list

  • Install elasticsearch debian package

sudo apt-get update && sudo apt-get install elasticsearch

  • Configure systemd to start Elasticsearch

sudo /bin/systemctl daemon-reload
sudo /bin/systemctl enable elasticsearch.service
  • Start Elasticsearch

sudo systemctl start elasticsearch.service

  • Create the billsections index

In Python, with the flatgov virtual environment, run createIndex(delete=True) from https://github.com/aih/FlatGov/blob/master/flatgovtools/elastic_load.py. This will create the index with the correct mappings.

Install Elasticdump to restore and backup data (requires NodeJS and NPM)

  • Install nvm and LTS version of NodeJS (apt version is quite old)

$ curl -sL https://raw.githubusercontent.com/creationix/nvm/v0.35.3/install.sh -o install_nvm.sh
$ bash install_nvm.sh
$ nvm install v14.15.0
$ nvm use  v14.15.0
$ nvm alias default  v14.15.0
  • Install elasticdump globally

npm install elasticdump -g

Restore index data
  • Unzip data

gzip -d elasticdump.billsections.json.gz

  • Load data to Elasticsearch

nohup \
elasticdump \
  --input="${file_name}.json" \
  --output=http://localhost:9200/billsections \
  --limit=1 \
  --fileSize=100kb &

nohup elasticdump --input="elasticdump.billsections.json" --output=http://localhost:9200/billsections --fileSize=100kb --limit=10

The limit and fileSize options slow loading into the index, but prevent Elasticsearch from crashing due to memory limits.

Note
To avoid crashes with Elasticsearch, it may be helpful to add swap memory. See https://linuxize.com/post/how-to-add-swap-space-on-ubuntu-18-04/

Install and Configure Nginx

  • Install Nginx

$sudo apt-get install -y nginx

  • Copy Nginx configuration into /etc/nginx/sites-available/

$ sudo cp /opt/flatgov/FlatGov/server_py/flatgov_nginx.conf /etc/nginx/sites-available/flatgov_nginx.conf
  • Symlink to this file from /etc/nginx/sites-enabled so nginx can see it:

$ sudo ln -s /etc/nginx/sites-available/flatgov_nginx.conf /etc/nginx/sites-enabled/

  • Start Nginx

$ sudo systemctl start nginx

Serve with a wsgi server

Using uwsgi

To run from command-line:

$ cd /opt/flatgov/FlatGov/server_py/flatgov
$ uwsgi --ini flatgov_uwsgi.ini # the --ini option is used to specify the ini file where uwsgi settings are defined

Once the Emperor mode directory is created (see https://uwsgi-docs.readthedocs.io/en/latest/tutorials/Django_and_nginx.html#emperor-mode), it is possible to run: nohup uwsgi --emperor /etc/uwsgi/vassals --uid www-data --gid www-data &

Then whenever there are changes to the Django directory, uwsgi will reload.

  • Restart Nginx

$ sudo systemctl restart nginx

TODO: set deployment to 'production' (i.e. remove debug info)

Using waitress (compatible with Windows)

The waitress server will already be installed in your pyenv environment from requirements.txt. The server_py/server.py file can be used to serve the app from the command line with python server.py (within the flatgov pyenv environment).

Add other SSH users

  • User generates a keypair:

$ ssh-keygen
  • User then sends .pub file to admin (not the other file)

  • Admin copies .pub file to server:

$ echo "put /path/to/mykey.pub" | sftp -i /path/to/admin/key.pem admin-username@ec2-address.amazonaws.com
  • Admin sets up new user (david) on server:

$ sudo adduser --disabled-password david
$ sudo mkdir /home/david/.ssh
$ sudo cp mykey.pub /home/david/.ssh/authorized_keys
$ sudo chown -R david:david /home/david/.ssh
$ sudo chmod 700 /home/david/.ssh
$ sudo chmod 600 /home/david/.ssh/authorized_keys
  • Now user can log in to server:

$ ssh -i /path/to/mykey david@ec2-address.amazonaws.com

Provide sudoer access (only) to those who need it [CAUTION]

Warning
This can be insecure. Make sure that anyone who has sudo access really needs it and will act responsibly.

Change password defaults in the sudoers file:

$ sudo visudo

This opens a vi instance on /etc/sudoers. Find the line reading:

## Allows people in group wheel to run all commands
%wheel        ALL=(ALL)       ALL

Comment that out. Below that, you will find the line:

## Same thing without a password
# %wheel  ALL=(ALL)       NOPASSWD: ALL

Uncomment this option, then save and quit vi.

Finally, to provide sudo access to the new user:

$ sudo usermod -aG wheel USERNAME