# Installing wetsuite

Wetsuite as a project consists of a few distinct things, such as 
the website and its guidance,
notebooks that are specific projects or experiments,
and our own library of helpers that most of those notebooks rely on.

This page is mostly about the last of these:
installing that library, often so that you can run those notebooks,
which rely on our library (and on a dozen or so other libraries that ours relies on).

**The short version** is that
- `pip install -U wetsuite` should install the library and everything it relies on
- while we are still working on the library, occasionally re-running that command should update it to the latest version
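
To check whether that install has already happened in the environment you are in, something like the following can help. It is a minimal sketch that assumes the installed package is registered under the same name as in the pip line, `wetsuite`; the helper name `installed_version` is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Assumption: the distribution name matches the pip line, i.e. "wetsuite".
found = installed_version("wetsuite")
if found is None:
    print("wetsuite is not installed in this environment; run: pip install -U wetsuite")
else:
    print("wetsuite version", found, "is installed")
```

This only reports what is installed right now; it does not tell you whether a newer version exists.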


## Notes on notebooks

If you use our notebooks, you will notice that many start with a cell saying `!pip install -U wetsuite` (and a link to this page)
- this is a reminder to do the install, if you haven't already
- some ways of running notebooks set up a very specific python environment; installing from inside the notebook usually means the library ends up in the environment that notebook is using
- the leading exclamation mark is the notebook's way of saying 'this is a command-line command, not python code'
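
If you want to be explicit about which environment gets the install, you can ask the running python itself to run pip. The following is a sketch of that idea, not something our notebooks do; the function names `pip_install_command` and `install` are made up for this example.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command that targets the interpreter running this code."""
    # sys.executable is the path of the current python, so "-m pip" installs into its environment.
    return [sys.executable, "-m", "pip", "install", "-U", package]


def install(package):
    """Run the install; raises CalledProcessError if pip reports a failure."""
    subprocess.check_call(pip_install_command(package))


# Show the command without running it; call install("wetsuite") to actually install.
print(" ".join(pip_install_command("wetsuite")))
```

The command is only printed here, so that running the cell has no side effects until you call `install` yourself.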

In terms of installation, it matters whether
- you run them on your own infrastructure -- **installing once is fine**, and you can mostly forget about it afterwards
- you run them on colab -- you get a fresh sandbox each time, **so you always start a session** by running these install lines

See also [notebooks_intro](notebooks_intro.ipynb) for more context.
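
If you are unsure which of the two situations you are in, a notebook cell can make a guess. This is a rough sketch based on a common heuristic (colab sessions have the `google.colab` module loaded), not an official check; the function name `running_in_colab` is made up for this example.

```python
import sys


def running_in_colab():
    """Guess whether this code runs on Google Colab (a heuristic, not an official check)."""
    # Assumption: a colab session has already loaded the google.colab module.
    return "google.colab" in sys.modules


if running_in_colab():
    print("colab: fresh sandbox, so run the install lines at the start of every session")
else:
    print("own infrastructure: installing once should be enough")
```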


## Notes on spacy

A number of notebooks also use spacy,
and want you to download specific spacy models (which do not seem to be easily installed via python-standard mechanisms).

We will include similar "please run this to install" lines wherever we use them, such as:

    !python3 -m spacy download nl_core_news_lg
    !python3 -m spacy download en_core_web_lg    
    !python3 -m spacy download xx_sent_ud_sm    
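
To see which of these models still need downloading, you can check whether python can find them. This sketch assumes that a downloaded spacy model is installed as an importable package under the model's own name; the helper name `missing_models` is made up for this example.

```python
import importlib.util

# Model names taken from the download lines above.
MODEL_NAMES = ["nl_core_news_lg", "en_core_web_lg", "xx_sent_ud_sm"]


def missing_models(names):
    """Return the names that cannot be found as importable packages."""
    return [name for name in names if importlib.util.find_spec(name) is None]


for name in missing_models(MODEL_NAMES):
    print("missing model, install with: python3 -m spacy download", name)
```

It prints nothing when all three models are already present.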


## Notes you can ignore

* you can add `--quiet` to that pip line if you want less output

* while the project is under development, you may wish to occasionally re-run that pip line so that it updates to the latest version
  - you don't have to do that every time you start a notebook, but it shouldn't hurt

* you _may_ wish to work in a python virtual environment
  - programmers will nod and say that this is the clean way to do it
  - others may get away with not doing it (it takes extra steps and extra thought)
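
If you are not sure whether you are in a virtual environment right now, python can tell you. This is a small sketch that only recognizes environments made with `venv` or `virtualenv`; other kinds (conda environments, for example) will show up as `False`. The function name `in_virtual_environment` is made up for this example.

```python
import sys


def in_virtual_environment():
    """Return True when running inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the venv and sys.base_prefix at the base install.
    return sys.prefix != sys.base_prefix


print("virtual environment:", in_virtual_environment())
print("packages install under:", sys.prefix)
```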

<!--
## On CPU and GPU

Using your main processor is sometimes slower than GPU - but it's surer to work.


(TODO: rewrite)
Various example code defaults to the CPU variant of spacy so that it functions everywhere.

If you have more than a handful of documents, and a GPU, 
then it becomes interesting to use the GPU for the parts that can use it.


This will require some fiddling at install time.

For plain spacy (see also its documentation) this comes down to figuring out your environment's CUDA version (on linux see `nvcc -V`), then installing with
  pip install spacy[cuda110]
instead of
  pip install spacy


We try to wrap that dependency in our own naming, so do
  pip install wetsuite[spacy-cuda110]
instead of
  pip install wetsuite

TODO: see if/when we can rely on [cuda-autodetect](https://spacy.io/usage) instead.
-->
