In [12]:
import signac
import numpy as np
import matplotlib.pyplot as plt
import os

## STEP 1: Use signac to search and narrow down our workspace.  

### GOAL: Use signac to narrow down the jobs we want to work with right now.  

- "Job" is a signac term; in our case 1 job = 1 statepoint = 1 simulation.    
- Each job has its own directory inside of the workspace directory, and the name of the directory is the job ID.  
- We narrow down the jobs we want by choosing which set of parameters (statepoints) we want to work with.


Right now, the PTB7 workspace directory has 441 different folders (jobs/simulations/statepoints) with more coming.  

The workspace contains small, medium, and large systems; 5mers, 10mers, and 15mers; many different densities and temperatures; quenched simulations (the simulation starts at the target temperature); and annealed simulations (a series of simulations run while slowly lowering the temperature).  These are the parameters we can choose from when telling signac to search the workspace and return only the jobs we want.

Right now, we are working with quenched simulations of small systems, so these two parameters will be effectively "hard-coded" into our signac process.  The parameter we will vary is the polymer length (5mer, 10mer, 15mer).  For example, we tell signac to return only small, quenched 5mer systems and run those jobs through whatever process we need (finding slopes of MSDs); then we change 5mer to 10mer and run the same process, and again with 15mer.

The other parameters (temperature and density in this case) are not given any constraints, so signac will return simulations for every combination of temperature and density.  




The first thing we need to do is give signac a "filter"
so that it knows how to narrow down the workspace.    

I typically use a dictionary, which is a built-in data type in Python (just as lists are).  

A dictionary consists of key:value pairs, just as actual dictionaries consist
of word:definition pairs.

Here is an intuitive example: a Python dictionary that holds information
about me.

`chris_dict = {"Name": "Chris", "School": "BSU", "Major": "MSE", "Status": "Graduate Student"}`

We can see the `"key": "value"` pairs. Also, dictionaries are written with curly braces, `{}`.
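
As a quick illustration of how a dictionary behaves, here is a minimal sketch using the same `chris_dict`. The `"Advisor"` key is hypothetical and is only there to show what happens when a key is missing.

```python
# The same dictionary as in the example above
chris_dict = {
    "Name": "Chris",
    "School": "BSU",
    "Major": "MSE",
    "Status": "Graduate Student",
}

# Look up a value by its key
school = chris_dict["School"]
print(school)

# Loop over every key:value pair
for key, value in chris_dict.items():
    print(f"{key}: {value}")

# .get() returns a fallback instead of raising KeyError when a key is missing
# ("Advisor" is a hypothetical key that is not in the dictionary)
advisor = chris_dict.get("Advisor", "not listed")
print(advisor)
```

Values are always looked up by key, which is why a dictionary works well as a filter: each key names a parameter and each value says what that parameter must be.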

So, let's make a dictionary for signac.  In this case the keys will be the parameters, and the 
values will be the specific value of each parameter we want to filter by.  For example:

`statepoint_dict = {"size": "small", "process": "quench", "molecule": "PTB7_15mer_smiles"}`

This tells signac I only want jobs whose `size` parameter is `small`, whose `process` parameter is `quench`,
and whose `molecule` parameter is the 15mer.
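
To make the filtering idea concrete, here is a plain-Python sketch of what "matching a filter" means. It is not signac's implementation, just an illustration. The `matches` helper, the `temperature` and `density` key names, their values, and the `PTB7_5mer_smiles` name are all assumptions made up for this example.

```python
def matches(statepoint, filter_dict):
    """Return True if every key in filter_dict has the same value in statepoint."""
    return all(
        key in statepoint and statepoint[key] == value
        for key, value in filter_dict.items()
    )

# Hypothetical statepoints (key names and values are made up for illustration)
statepoints = [
    {"size": "small", "process": "quench", "molecule": "PTB7_15mer_smiles",
     "temperature": 1.0, "density": 0.8},
    {"size": "small", "process": "quench", "molecule": "PTB7_15mer_smiles",
     "temperature": 2.0, "density": 0.9},
    {"size": "small", "process": "quench", "molecule": "PTB7_5mer_smiles",
     "temperature": 1.0, "density": 0.8},
    {"size": "large", "process": "anneal", "molecule": "PTB7_15mer_smiles",
     "temperature": 1.0, "density": 0.8},
]

statepoint_dict = {"size": "small", "process": "quench", "molecule": "PTB7_15mer_smiles"}

# Keep only the statepoints that satisfy every key:value pair in the filter
selected = [sp for sp in statepoints if matches(sp, statepoint_dict)]

for sp in selected:
    print(sp["temperature"], sp["density"])
```

Because the filter says nothing about temperature or density, both small, quenched 15mer entries are kept regardless of those values, while the 5mer and the large annealed system are dropped.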

Now we have to pass this dictionary to signac.  This is where you need some familiarity with 
signac's API (the functionality accessible through its Python package).

    Look here: https://docs.signac.io/projects/core/en/latest/
    
Signac has project-level functionality and job-level functionality.  PTB7 is one project with 441 jobs.  Sorting through all of the jobs that live within a single project and returning only certain jobs is a project-level feature.

    Project https://docs.signac.io/projects/core/en/latest/api.html#the-project
    Job https://docs.signac.io/projects/core/en/latest/api.html#the-job-class

Looking through signac's Project API, there is a method called `project.find_jobs` which
we can use to narrow down which jobs we are working with.


In [16]:
# Initialize the project
project = signac.get_project()

# Define your "filter"
statepoint_dict = {"size": "small", "process": "quench", "molecule": "PTB7_15mer_smiles"}

# Tell signac to find the jobs that match the filter we give it

# Here, we are passing the dictionary above into project.find_jobs()
job_list = project.find_jobs(statepoint_dict)
# job_list now holds all of (and only) the jobs that meet the requirements given in statepoint_dict
# (find_jobs returns an iterable of jobs that you can loop over, not a plain Python list)
