# Install DataJoint as a Python package

To use DataJoint:

- Make sure you have [Python 3](https://www.python.org/downloads/) installed. (Most likely you already do.)
- Then install DataJoint as a Python package. In the terminal, run:  

`pip3 install datajoint --pre`  

The `--pre` flag lets pip install pre-release versions, so this gets the most up-to-date development version of DataJoint. (Note that on some computers the command is simply `pip` instead of `pip3`. Either way, what matters is that it installs packages for the Python 3 you are using.)

- On **Windows** (and maybe also Mac), you also need to install `pydotplus` (see instructions [here](https://docs.datajoint.io/python/setup/02-DataJoint-Python-Windows-Install-Guide.html)):

`pip3 install pydotplus`

- On **Mac**, you also need to install graphviz, which you can do using the [Homebrew installer](https://brew.sh/). Again from the terminal (not within Python):

`brew install graphviz`

(If that installation doesn't work, you might need to install SVN first with `brew install svn` before you can install graphviz.)

*You're good to go!*  Start up your Python 3 and start having fun.
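
If you are unsure whether `pip3` installed DataJoint for the same Python 3 you are about to run, a quick check from inside Python can help. The following is a minimal sketch, assuming Python 3.8 or later (for `importlib.metadata`); the helper name `describe_install` is made up for this example and is not part of DataJoint.

```python
import sys
from importlib import metadata


def describe_install(package="datajoint"):
    """Return (interpreter path, installed version or None) for a package."""
    try:
        return sys.executable, metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed for the interpreter running this code.
        return sys.executable, None


interpreter, version = describe_install()
print("Python interpreter:", interpreter)
print("datajoint version:", version or "not installed for this interpreter")
```

If the version shows as not installed, the `pip3` you used probably belongs to a different Python than the one you started.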


# Configuring DataJoint

Before you can use DataJoint, you need to configure it. For DataJoint to work, you need to give it information about the database connection, namely the database hostname.

Start by importing the package. The convention is to import it as `dj`.

In [1]:
import datajoint as dj

In [2]:
dj.__version__

'0.12.dev7'

In [3]:
dj.config

{   'connection.charset': '',
    'connection.init_function': None,
    'database.host': 'localhost',
    'database.password': None,
    'database.port': 3306,
    'database.reconnect': True,
    'database.use_tls': None,
    'database.user': None,
    'display.limit': 12,
    'display.show_tuple_count': True,
    'display.width': 14,
    'fetch_format': 'array',
    'loglevel': 'INFO',
    'safemode': True}

Notice that `database.host` is already set to `localhost`. We need DataJoint to talk to the U19 database instead, which we can set up with:

In [4]:
dj.config['database.host'] = 'datajoint00.pni.princeton.edu'

In [5]:
dj.config

{   'connection.charset': '',
    'connection.init_function': None,
    'database.host': 'datajoint00.pni.princeton.edu',
    'database.password': None,
    'database.port': 3306,
    'database.reconnect': True,
    'database.use_tls': None,
    'database.user': None,
    'display.limit': 12,
    'display.show_tuple_count': True,
    'display.width': 14,
    'fetch_format': 'array',
    'loglevel': 'INFO',
    'safemode': True}

Now that we are pointing to the right place, let's try connecting. You can explicitly trigger a connection with `dj.conn()`:

In [6]:
dj.conn()

Please enter DataJoint username: shans
Please enter DataJoint password: ········
Connecting shans@datajoint00.pni.princeton.edu:3306


DataJoint connection (connected) shans@datajoint00.pni.princeton.edu:3306

Once you have verified that the connection works, save the configuration so that you don't have to change `database.host` every time you work with DataJoint. Simply run the following:

In [7]:
dj.config.save_local()

In [8]:
ls

0-Get DataJoint Ready.ipynb
1-Explore U19 data pipeline with DataJoint.ipynb
2-Analyze data with U19 pipeline and save results.ipynb
dj_local_conf.json


Notice that this created `dj_local_conf.json` in the local directory.

In [None]:
# %load dj_local_conf.json
{
    "database.host": "datajoint00.pni.princeton.edu",
    "database.password": null,
    "database.user": null,
    "database.port": 3306,
    "database.reconnect": true,
    "connection.init_function": null,
    "connection.charset": "",
    "loglevel": "INFO",
    "safemode": true,
    "fetch_format": "array",
    "display.limit": 12,
    "display.width": 14,
    "display.show_tuple_count": true,
    "database.use_tls": null
}

Inside `dj_local_conf.json` you will find the saved `dj.config` information. Notice that `database.host` is set to the right value.
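
Because the saved file is plain JSON, you can also inspect it from a script rather than opening it by hand. This is a minimal sketch using only the standard library; the helper name `load_saved_settings` is hypothetical, and the code assumes the file sits in the current working directory as shown in the listing above.

```python
import json
from pathlib import Path


def load_saved_settings(path="dj_local_conf.json"):
    """Read a saved DataJoint local config file and return it as a dict."""
    with open(path) as f:
        return json.load(f)


conf_path = Path("dj_local_conf.json")
if conf_path.exists():
    saved = load_saved_settings(conf_path)
    # .get() avoids an error if the key is missing from the file
    print("Saved host:", saved.get("database.host"))
else:
    print("No dj_local_conf.json in", Path.cwd())
```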

### Bonus: saving username and password into local config

Although you no longer have to specify `database.host` in `dj.config`, DataJoint will still prompt you for your username and password every time it connects to the database. This may be fine when working interactively, but it is limiting when you want a script to run without interaction. To get around this, you can also save your username and password into the configuration as follows:

In [11]:
from getpass import getpass # use this to type password in without showing it

In [12]:
dj.config['database.user'] = 'shans'
dj.config['database.password'] = getpass('Type password:')

Type password:········


Test the connection with these credentials:

In [12]:
dj.conn()

DataJoint connection (connected) shans@datajoint00.pni.princeton.edu:3306

Now save the username and password into the local config:

In [13]:
dj.config.save_local()

**Note that this saves your password in clear readable text in that local configuration file -- very insecure.**
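
If you would rather not keep the password on disk, one possible alternative is to read the credentials from environment variables when the script starts and set them in `dj.config` without calling `save_local()`. The sketch below is only an illustration: the variable names `U19_DJ_USER` and `U19_DJ_PASS` and both helper functions are hypothetical, chosen for this example, and the config is treated simply as something that supports item assignment, as `dj.config` does above.

```python
import os


def credentials_from_env(environ=None,
                         user_var="U19_DJ_USER",
                         password_var="U19_DJ_PASS"):
    """Build config entries from environment variables.

    The variable names are hypothetical; use whatever suits your setup.
    Raises KeyError if either variable is not set.
    """
    environ = os.environ if environ is None else environ
    return {
        "database.user": environ[user_var],
        "database.password": environ[password_var],
    }


def apply_credentials(config, credentials):
    """Copy each credential into a dict-like config such as dj.config."""
    for key, value in credentials.items():
        config[key] = value
    return config


# In a script this might be used as follows (not run here):
#   import datajoint as dj
#   apply_credentials(dj.config, credentials_from_env())
#   dj.conn()
```

With this approach the password lives only in the environment of the process running the script, so `dj_local_conf.json` can keep `database.password` as `null`.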