# User Guide

## 1) Generating Results
1. Load targets through either of the following:
    1. Defining a list of targets
    2. Loading processed targets
2. Generate results and indicate the output directory

### Results Output

The results are saved in the results/directory_you_indicated folder. The following files are produced:
1. directory_results.csv - contains main results
2. directory_linear_ratio.png - linear likelihood ratio curve
3. directory_linear_sites.png - actual sites vs linear predicted sites
4. directory_threshold_ratio.png - threshold likelihood ratio curve
5. directory_threshold_sites.png - actual sites vs threshold predicted sites
6. directory_gc_boxplot.png - growth coefficients boxplot
7. directory_p_graph.png - p_samples vs p_controls
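
To check that a run produced every file listed above, a short script can build the expected paths and report any that are missing. This is only a sketch: it assumes that the "directory" prefix in the file names stands for the output directory name, and expected_outputs is a hypothetical helper, not part of the tool.

```python
import os

# File name suffixes taken from the list above.
OUTPUT_SUFFIXES = [
    "results.csv",
    "linear_ratio.png",
    "linear_sites.png",
    "threshold_ratio.png",
    "threshold_sites.png",
    "gc_boxplot.png",
    "p_graph.png",
]


def expected_outputs(results_path, output_directory):
    """Return the paths the run is assumed to write (hypothetical helper)."""
    folder = os.path.join(results_path, "results", output_directory)
    return [
        os.path.join(folder, f"{output_directory}_{suffix}")
        for suffix in OUTPUT_SUFFIXES
    ]


# Report which of the assumed output files are not on disk.
missing = [p for p in expected_outputs(os.getcwd(), "full_e") if not os.path.exists(p)]
print(f"{len(missing)} expected file(s) missing")
```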

### 1a) Generate results through the main menu

1. Run the "main_menu.py" Python file (command line: python main_menu.py)
2. Load targets:
    - Option 1: Define list of targets
    - Option 3: Load processed targets
3. Generate results (Option 7)

### 1b) Generate results through a script

Another option is to call the run_experiment function of main_module.py.

1. Import main_module
2. Call the run_experiment function with at least the following parameters:
    - results_path
    - target_list_file
    - output_directory
3. Modify other parameters if necessary
4. Run the Python script

Below is an example of how to use it and the default values of its parameters.

In [None]:
import os
import main_module as mm

#--------------------------#
# run_experiment parameters
#--------------------------#
# Required parameters:
# results_path: directory where the results folder will be created
# target_list_file: filename of the target list in the tests folder (no extension)
# output_directory: name of the output directory (created inside the results folder)
#
# Optional parameters and their default values:
# population_data_name="Eriksson"
# controls="All"
# date_window=10000
# user_max_for_uninhabited=-1
# clustering_on=False
# critical_distance=1
# filter_date_before=-1
# filter_not_direct=False
# filter_not_figurative=False
# filter_not_controversial=False
# perform_cross_validation=False
# number_of_kfolds=100
# minimum_likelihood_ratio=0.0000001
# min_date_window=0
# critical_time=10000
# filter_min_date=-1
# filter_max_date=-1
# filter_min_lat=-1
# filter_max_lat=-1

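# Use the current working directory as results_path
base_path = os.getcwd()
# Run with default parameters; output goes to the full_e directory
mm.run_experiment(base_path, "rockpaintings v8", "full_e")
# Same target list with filter_not_direct enabled; output goes to full_e_dir
mm.run_experiment(base_path, "rockpaintings v8", "full_e_dir", filter_not_direct=True)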

## 2) Changing Default Values

The default values are set in the init method of MainProgram:

In [1]:
# Excerpt from the MainProgram class; the module must import os and pandas as pd
def __init__(self):
    self.base_path = os.getcwd()
    self.filters_applied = ""
    self.population_data_sources = []
    self.target_list = []
    self.dataframe_loaded = False
    self.dataframe = pd.DataFrame()
    self.controls = "All"
    self.controls_dataframe = pd.DataFrame()
    self.default_statistic = 'Mean'
    self.clustering_on = False
    self.critical_distance = 250
    self.date_window = 10000
    self.number_of_kfolds = 10
    self.minimum_likelihood_ratio = 0
    self.perform_cross_validation = False
    self.user_max_for_uninhabited = 1000
    self.default_mfu = True
    self.min_date_window = 0
    self.critical_time = 10000

## 3) Adding Additional Population Data

You need two files:
1. binary file: contains the lats, lons, times, and densities
2. info file: indicates necessary details about the binary file (meta-data)

These files should be put inside the population_data folder. 

### Binary File

Your binary file should contain latitudes, longitudes, times, and densities, stored in the following variables:
1. lats_txt: list of latitudes
2. lons_txt: list of longitudes (add 360 to any longitude below 0)
3. ts_txt: list of times
4. dens_txt: 2D list of densities, with one row per time and one column per lat/lon location

You can create a binary file using NumPy's savez_compressed function.

For example, suppose you have the following densities, each with a time and a location:
* 1000, time=0, lat=45, lon=90
* 2000, time=0, lat=50, lon=-50
* 3000, time=1, lat=0, lon=0

lats_txt value:

In [12]:
lats_txt = [45, 50, 0]

lons_txt value:

In [13]:
lons_txt = [90, (-50 + 360), 0]

ts_txt value:

In [14]:
ts_txt = [0, 1]

dens_txt value:

In [15]:
dens_txt = [[1000, 2000, 0],
            [0, 0, 3000]]
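
The same four variables can also be built programmatically. The sketch below assumes the input is a list of (density, time, lat, lon) records with one record per location; this record format is hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical input: one (density, time, lat, lon) record per location.
records = [
    (1000, 0, 45, 90),
    (2000, 0, 50, -50),
    (3000, 1, 0, 0),
]

# One column per record; negative longitudes are shifted by 360.
lats_txt = [lat for _, _, lat, _ in records]
lons_txt = [lon + 360 if lon < 0 else lon for _, _, _, lon in records]

# One row per distinct time.
ts_txt = sorted({t for _, t, _, _ in records})

# Fill densities into a (time, location) grid; unset cells stay 0.
dens = np.zeros((len(ts_txt), len(records)), dtype=int)
for col, (density, t, _, _) in enumerate(records):
    dens[ts_txt.index(t), col] = density
dens_txt = dens.tolist()

print(dens_txt)
```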

Then save these values in a compressed file:

In [16]:
import numpy as np
np.savez_compressed("binary.npz", lats_txt=lats_txt, lons_txt=lons_txt,
                    ts_txt=ts_txt, dens_txt=dens_txt)
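
Before placing the file in the population_data folder, it may help to load it back and confirm that the variable names and array shapes match the description above. The sketch below writes to a temporary directory so it does not touch your real files; the checks are suggestions, not validation performed by the tool itself.

```python
import os
import tempfile

import numpy as np

lats_txt = [45, 50, 0]
lons_txt = [90, 310, 0]
ts_txt = [0, 1]
dens_txt = [[1000, 2000, 0], [0, 0, 3000]]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "binary.npz")
    np.savez_compressed(path, lats_txt=lats_txt, lons_txt=lons_txt,
                        ts_txt=ts_txt, dens_txt=dens_txt)
    # Read every stored array back into memory before the file is removed.
    with np.load(path) as data:
        loaded = {name: data[name] for name in data.files}

# dens_txt should have one row per time and one column per location.
shape_ok = (
    len(loaded["lats_txt"]) == len(loaded["lons_txt"])
    and loaded["dens_txt"].shape == (len(loaded["ts_txt"]), len(loaded["lats_txt"]))
)
# All longitudes should be non-negative after adding 360 to negative values.
lons_ok = bool((loaded["lons_txt"] >= 0).all())

print(sorted(loaded), shape_ok, lons_ok)
```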

### Info file

Your info file should contain the following information:
1. time multiplier: integer - multiplier for values in time array (use 1 if using actual time)
2. bin size: integer - size of bin when generating distributions
3. max population: integer - max population in your density dataset
4. max for uninhabited: integer - default max population considered as uninhabited
5. is active: boolean - whether this is set as the active population data by default
6. ascending time: boolean - whether the time array is arranged in ascending order


Here is an example of the format, using Eriksson's info file:

In [None]:
time_multiplier	25
bin_size	200
max_population	6000
max_for_uninhabited	1000
is_active	True
ascending_time	False

The label and the value are separated by a single tab (\t).
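
To sanity-check an info file before using it, you could read it back with a few lines of Python. This is a minimal sketch and not the tool's own parser: parse_info is a hypothetical helper that assumes every value is either an integer or the literal True or False, as in the example above.

```python
def parse_info(text):
    """Parse tab-separated label/value lines into a dict (hypothetical helper)."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        label, value = line.split("\t", 1)
        value = value.strip()
        if value in ("True", "False"):
            info[label] = value == "True"
        else:
            info[label] = int(value)
    return info


# Contents of the example info file shown above.
example = (
    "time_multiplier\t25\n"
    "bin_size\t200\n"
    "max_population\t6000\n"
    "max_for_uninhabited\t1000\n"
    "is_active\tTrue\n"
    "ascending_time\tFalse\n"
)

info = parse_info(example)
print(info)
```

A line without a tab, or with a non-integer value, raises a ValueError here, which is a quick way to spot labels separated by spaces instead of a tab.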