# Create Dfr-Browser

This notebook provides an interface to code that creates dfr-browsers for models generated in the `topic_modeling` module. A full user guide is available for this notebook in the module's <a href="README.md" target="_blank">README</a> file. This notebook uses code originally written by Andrew Goldstone for this Dfr-browser visualization. See <a href="https://github.com/agoldst/dfr-browser" target="_blank">https://github.com/agoldst/dfr-browser</a> for Goldstone's original code and documentation.

### Info

__authors__    = 'Jeremy Douglass, Scott Kleinman, Lindsay Thomas'  
__copyright__ = 'copyright 2019, The WE1S Project'  
__license__   = 'GPL'  
__version__   = '2.0'  
__email__     = 'lindsaythomas@miami.edu'

## Settings

Every time you open this notebook, you must run the cell below before running anything else.

In [None]:
# Python imports
import os
from pathlib import Path
from IPython.display import display, HTML

# Define paths
current_dir            = %pwd
project_dir            = str(Path(current_dir).parent.parent)
current_reldir         = current_dir.split("/write/")[1]
data_dir               = project_dir + '/project_data'
json_dir               = project_dir + '/project_data/json'
model_dir              = data_dir + '/models'
metadata_dir           = data_dir + '/metadata'
metadata_csv_file      = metadata_dir + '/metadata-dfrb.csv'
metadata_file_reorder  = metadata_dir + '/metadata-dfrb.csv'
browser_meta_file_temp = metadata_dir + '/meta.temp.csv'
browser_meta_file      = metadata_dir + '/meta.csv'
browsers_dir           = current_dir + '/browsers'
project_name           = project_dir.split('/')[-1]
tmp                    = data_dir.split(project_name)
project_data_rel       = 'projects/' + tmp[0].split('/')[-2] + '/' + project_name + '/project_data/'

# Load required scripts
%run {project_dir}/config/config.py
%run scripts/create_dfrbrowser.py

# Feedback message
display(HTML('<p style="color: green;">Setup complete.</p>'))
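The path logic in the settings cell can be hard to follow, so here is a minimal sketch of how the values resolve for one example location. The directory `/home/user/write/projects/user1/my_project/modules/dfr_browser` and the helper name `derive_paths` are hypothetical, chosen only for illustration; your own layout may differ. The sketch uses `PurePosixPath` instead of `Path` so that it behaves the same on any operating system.

```python
from pathlib import PurePosixPath


def derive_paths(current_dir):
    """Mirror the string handling in the settings cell for one directory.

    `current_dir` must contain a '/write/' segment, as the settings cell assumes.
    """
    # The project directory is two levels above the notebook's directory
    project_dir = str(PurePosixPath(current_dir).parent.parent)
    # Path of the notebook directory relative to the '/write/' folder
    current_reldir = current_dir.split("/write/")[1]
    data_dir = project_dir + "/project_data"
    project_name = project_dir.split("/")[-1]
    # Name of the folder that contains the project folder
    parent_name = data_dir.split(project_name)[0].split("/")[-2]
    project_data_rel = (
        "projects/" + parent_name + "/" + project_name + "/project_data/"
    )
    return {
        "project_dir": project_dir,
        "current_reldir": current_reldir,
        "data_dir": data_dir,
        "project_name": project_name,
        "project_data_rel": project_data_rel,
    }


# Hypothetical notebook location, used only to illustrate the logic
example_dir = "/home/user/write/projects/user1/my_project/modules/dfr_browser"
paths = derive_paths(example_dir)
for name, value in paths.items():
    print(f"{name:18} {value}")
```

Because `project_data_rel` is built by splitting on the project name, the result is only reliable when that name appears once in the full path.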

## Create Dfr-Browser Metadata Files from JSON Files

This cell opens each JSON file in your project's `json` directory and extracts the metadata that dfr-browser needs. It creates both the `metadata_csv_file` and the `browser_meta_file_temp` files.

In [None]:
# Running this cell deletes any old metadata files and creates a new metadata folder within the project_data directory

dfrb_metadata(metadata_dir, metadata_csv_file, browser_meta_file_temp, browser_meta_file, json_dir)
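To give a rough idea of what this kind of step involves, the sketch below reads every JSON file in a folder and writes selected fields to a CSV file. It is not the implementation of `dfrb_metadata`: the function name `collect_metadata` and the field names `id`, `title`, and `pub_date` are assumptions made for illustration, and the real script may read different fields and write a different layout.

```python
import csv
import json
from pathlib import Path


def collect_metadata(json_dir, csv_path, fields=("id", "title", "pub_date")):
    """Write one CSV row per JSON file, keeping only the given fields.

    The default field names are hypothetical placeholders.
    """
    rows = []
    for json_path in sorted(Path(json_dir).glob("*.json")):
        with open(json_path, encoding="utf-8") as f:
            doc = json.load(f)
        # A missing field becomes an empty string instead of raising an error
        rows.append({field: doc.get(field, "") for field in fields})

    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(fields))
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

A call such as `collect_metadata(json_dir, metadata_dir + '/example.csv')` would produce one row per document, where `example.csv` is again just an illustrative file name.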

## Create Files Needed for Dfr-Browser

### Select Models to Create Visualizations

**Please run the next cell regardless of whether you change anything.**

By default, this notebook creates Dfr-browsers for all of the models you produced using the `topic_modeling` module. To create Dfr-browsers for only certain models, make those selections in the next cell (see the next paragraph). Otherwise, leave the value in the next cell set to `All`, which is the default.

**To produce browsers for some, but not all, of the models you created:** Navigate to the `your_project_name/project_data/models` directory in your project and note the name of each subdirectory in that folder. There should be one subdirectory for each model you produced, named `topicsN`, where `N` is the number of topics you chose to model (for example, `topics50`). To choose which subdirectories to produce browsers for, change the value of `selection` in the cell below to a list of subdirectory names. For example, to produce browsers for only the 50- and 75-topic models, change the value of `selection` below to `selection = ['topics50','topics75']`.

In [None]:
selection = 'All' # E.g. ['topics50','topics75']

Get the names of the model subdirectories to visualize, along with their state and scaled files. Alternatively, you can set the values of `subdir_list`, `state_file_list`, and `scaled_file_list` manually in the cell after the next one.

In [None]:
subdir_list, state_file_list, scaled_file_list = get_model_state(selection, model_dir)
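As an illustration of how a `selection` value could be turned into a list of model subdirectories, here is a minimal sketch. It is not the code behind `get_model_state`: the helper name `resolve_selection` and its error handling are assumptions, and the sketch does not locate the state or scaled files.

```python
from pathlib import Path


def resolve_selection(selection, model_dir):
    """Return the model subdirectory names matching `selection`.

    `selection` may be 'All', a single name, or a list of names.
    """
    # Only folders whose names start with 'topics' count as models
    available = sorted(
        p.name
        for p in Path(model_dir).iterdir()
        if p.is_dir() and p.name.startswith("topics")
    )
    if selection == "All":
        return available
    if isinstance(selection, str):
        selection = [selection]
    # Fail early if a requested model folder does not exist
    missing = [name for name in selection if name not in available]
    if missing:
        raise ValueError(f"No such model subdirectories: {missing}")
    return list(selection)
```

With this sketch, `resolve_selection(['topics50', 'topics75'], model_dir)` returns the same two names if both folders exist, and raises a `ValueError` otherwise.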

Optionally, set values manually (this cell does not need to be run if you have run the previous cell).

In [None]:
# subdir_list = []
# state_file_list = []
# scaled_file_list = []

Create and move the files needed for dfr-browser, using the model state and scaled files for all selected models. The cell prints the output of Goldstone's `prepare_data.py` script.

In [None]:
create_dfrbrowser(subdir_list, state_file_list, scaled_file_list, browser_meta_file, project_data_rel, current_dir, project_dir)

# Display links to the visualizations
display_links(project_dir, subdir_list, WRITE_DIR, PORT)

## Create Zipped Copies of Your Visualizations (Optional)

By default, browsers for all available models will be zipped. If you wish to zip only one model, change the `models` setting to indicate the name of the model folder (e.g. `'topics25'`). If you wish to zip more than one model, but not all, provide a list in square brackets (e.g. `['topics25', 'topics50']`).

In [None]:
# Configuration
models = 'All' # You can also select models with values like 'topics25' or ['topics25', 'topics50']

# Zip the models (zip() here is the function loaded from scripts/zip.py, not Python's built-in zip)
%run scripts/zip.py
zip(models)
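For readers curious about what zipping a browser folder involves, the sketch below uses `shutil.make_archive` from the standard library. It is not the contents of `scripts/zip.py`: the function name `zip_browsers` and the assumption that each browser lives in its own subfolder of `browsers_dir` are made for illustration only.

```python
import shutil
from pathlib import Path


def zip_browsers(models, browsers_dir):
    """Create one zip archive per selected browser folder.

    `models` may be 'All', a single folder name, or a list of names.
    Returns the paths of the archives that were created.
    """
    browsers_dir = Path(browsers_dir)
    if models == "All":
        names = sorted(p.name for p in browsers_dir.iterdir() if p.is_dir())
    elif isinstance(models, str):
        names = [models]
    else:
        names = list(models)

    archives = []
    for name in names:
        # make_archive appends '.zip' to the base name and returns the full path
        archive = shutil.make_archive(
            str(browsers_dir / name),    # archive path without the extension
            "zip",
            root_dir=str(browsers_dir),  # stored paths are relative to this folder
            base_dir=name,               # only this subfolder is archived
        )
        archives.append(archive)
    return archives
```

Each archive is written next to the folder it contains, so `zip_browsers('topics25', browsers_dir)` would create `topics25.zip` inside `browsers_dir`.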