# Problem Set 12: Introduction to Python

Author: Greg Wray

## Instructions

Create a markdown document within JupyterLab and answer the questions below using code blocks that generate the correct outputs. We encourage you to include explanatory text in your markdown document. 

Write "robust" solutions wherever possible. A good rule of thumb for judging whether your solution is appropriately "robust" is to ask yourself "If I added additional observations or variables to this data set, or if the order of variables changed, would my code still compute the right solution?"

Make sure your markdown is nicely formatted -- use headers, bullets, numbering, etc., so that the structure of the document is clear.

When completed, title your markdown file as follows (replace `XX` with the assignment number, e.g. `01`, `02`, etc):

-   `netid-assignment_XX-Fall2023.qmd`

Submit both your markdown file and the generated HTML document via the Assignments submission section on Sakai.


## Problems 

1.  Lists are one of the most common data structures in Python programs. You can create a list using an assignment statement:   
	`tree_list = ['red oak', 'loblolly pine', 'sassafras', 'tulip magnolia']`  

* Create `tree_list`. Use `type(tree_list)` to confirm that it is indeed a list object.
	
* Apply the `.sort()` method to `tree_list` (hint: see slide deck from class). What is the item returned by `tree_list[2]`?

* The `.append()` method provides a quick way to add an item to an existing list. Experiment to learn how to use `.append()` to add `'longleaf pine'` to `tree_list`. What happens if you try to add two items at once? 

* Sort `tree_list` once again using `.sort()` so that the new item is in the right location. Now try `.sort()` with the optional argument `reverse = True` and (separately) with the optional argument `key = len`. How do these arguments change the behavior of `.sort()`?

* It is often useful to know whether a given item is in a list, particularly when the list is long or is updated while your program is running. You can test for membership using the `in` operator. Experiment to construct an expression that tests whether `'red oak'` is in the list and returns `True`.

* There are dozens of methods that work with lists. Experiment and use whatever resources you prefer to find out what the following methods do when applied to a list: `.reverse()`, `.pop()`, and `.index()`. Write a one-sentence description of each method and provide an example using `tree_list`.
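
As a warm-up for the bullets above, here is a minimal sketch of the call syntax for sorting, appending, and membership testing. It uses a hypothetical `bird_list` (not part of the assignment) so that you still work out the `tree_list` answers yourself.

```python
# Hypothetical example list, used only to illustrate syntax
bird_list = ['wren', 'cardinal', 'blue jay']

bird_list.sort()                  # sorts the list alphabetically
print(bird_list)                  # ['blue jay', 'cardinal', 'wren']

bird_list.append('owl')           # adds a single item to the end
print('owl' in bird_list)         # True

# sorted() is a built-in function that returns a new, sorted list
by_length = sorted(bird_list, key=len)
print(by_length)                  # ['owl', 'wren', 'blue jay', 'cardinal']
```

Items of equal length keep their existing relative order because Python's sort is stable.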

2. Strings can initially be frustrating for programmers familiar with R, because they are implemented differently in Python. Copy and paste the text below into a code block to create a string object that you can work with:  
`long_str = 'Jabberwocky\nBy Lewis Carroll\n’Twas brillig, and the slithy toves\n\tDid gyre and gimble in the wabe:\nAll mimsy were the borogoves,\n\tAnd the mome raths outgrabe.'`  (If English is not your first language, don't worry if many of these words are unfamiliar. The poet used the most rare and obscure words he could find -- it's intended to be fun for exactly this reason.)

* Use `len()` to find out how many characters are in `long_str`. How are escape sequences like `\n` treated: as one or two characters? The `.count()` method can be used to tally the number of times a substring occurs. How many `'w'` characters are in `long_str`? 

* Use `print()` to see how `long_str` is formatted. Note how escaping with `\n` and `\t` is used to indicate different kinds of whitespace. How many tabs does `long_str` contain? 

* Square-bracket indexing is a common way to access substrings. What substring is returned by the slice `[70:74]`? You can locate the start index of a substring using the `.find()` method. Use `.find()` to locate the start index of `'toves'`. What is the slice that returns `'toves'`?
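
The string tools mentioned above can be tried out on a much shorter string first. The sketch below uses a hypothetical `short_str` (not part of the assignment) to show the syntax of `len()`, `.count()`, `.find()`, and slicing.

```python
# Hypothetical short string containing one newline and one tab
short_str = 'one\n\ttwo'

print(len(short_str))             # 8
print(short_str.count('o'))       # 2

start = short_str.find('two')     # index where the substring begins
print(start)                      # 5

# Slice from the start index to start + length of the substring
print(short_str[start:start + len('two')])   # two
```

Computing the end of the slice from the substring's length avoids counting characters by hand.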


3. Processing strings into smaller strings is called *splitting*. Your objective for this problem is to write code that splits `long_str` into six separate strings, each of which corresponds to one of the lines of the formatted output generated by the `print(long_str)` statement. 

* The core of this process is simpler than it may seem: use the `.split()` method and supply `'\n'` (the escape sequence for a newline) as the argument. What type of data structure is returned? What are the items in it? What happened to the newline escape sequences?

* Now all you need to do is clean up the output to remove the escape sequences that indicate tabs. You could write a loop to check each item and remove `'\t'` whenever present. But an easier way is simply to remove the tabs before splitting. You can do this with the `.replace()` method, which takes two arguments: the substring to search for and the substring to replace it with. 

* Some methods alter data objects *in place*, meaning that they change the original data. Others return a *copy* of the original, modified according to the method, but leave the original unchanged. If the method returns a copy, you need to assign the result to a new variable name in order to save it for future use. Experiment to find out whether `.split()` and `.replace()` modify in place or return a copy. 

* You should now be able to split `long_str` into six separate lines and eliminate all of the escape sequences using just two lines of code.   
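
The in-place versus copy distinction described above can be tested with a simple pattern: call the method, then print both the original object and the returned value. The sketch below applies that pattern to two methods that are not part of this problem (`.upper()` on a string and `.sort()` on a list); the variable names are hypothetical.

```python
# A string method: check the original and the returned value
word = 'wabe'
returned = word.upper()
print(word)        # wabe  (original unchanged)
print(returned)    # WABE  (a new string was returned)

# A list method: same test, different outcome
numbers = [3, 1, 2]
returned_2 = numbers.sort()
print(numbers)     # [1, 2, 3]  (original changed in place)
print(returned_2)  # None       (nothing useful was returned)
```

Running the same two-print test on `.split()` and `.replace()` should tell you which category each one falls into.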

4. Next, we want to count the number of characters on each line in the output from the program you wrote for problem 3 and figure out which one is the longest. In case you could not get that code to work properly, here is the output from the previous step: 
`['Jabberwocky', 'By Lewis Carroll', '’Twas brillig, and the slithy toves', 'Did gyre and gimble in the wabe:', 'All mimsy were the borogoves,', 'And the mome raths outgrabe.']`

* Write code that creates a list containing the number of characters on each of the six lines you extracted from `long_str`.  

* Now add code to identify the *index* of the longest line (not its length). In other words, your program should return a single integer: the number of the line that has the most characters. Since this is Python, your result should be zero-based! That way, you can use it to index the original object correctly (for example, to retrieve the longest line). 
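
One possible approach to "find the position of the largest value" is sketched below on a hypothetical list of words (not the poem lines). It is only one of several ways to do this, and it assumes that reporting the first match is acceptable when two items tie for longest.

```python
# Hypothetical list of strings
words = ['fig', 'banana', 'kiwi']

# Build a list of lengths, one per item
lengths = [len(w) for w in words]
print(lengths)                               # [3, 6, 4]

# .index() returns the zero-based position of the first match
longest_index = lengths.index(max(lengths))
print(longest_index)                         # 1
print(words[longest_index])                  # banana
```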

5. This problem and the next two build on the boolean network model that Paul presented in class. Starting with the function definitions and the basic boolean network model ("core of the simulation"), your task is to extend the code to check every possible starting condition in a single run. The first step is to explicitly enumerate all the possible starting conditions. With 3 nodes and 2 possible conditions for each, the number of combinations is 2 × 2 × 2 = 8. These need to be placed into a nested list of lists so that your program can step through each unique combination of starting conditions, one at a time:  
`[[True, True, True], [True, True, False], ...]`  
There are 3 ways you could construct this nested list: (1) write it out by hand; (2) use nested for loops (one for each node and one for each condition); or (3) use a method in the `itertools` module, which is part of the Python Standard Library. Any of these options is okay for solving this problem. However, note that the first option is simple with a small number of starting conditions, but does **not** scale well if there are more nodes! 
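
As an illustration of option (3), the sketch below uses `itertools.product` to enumerate the starting conditions for a smaller, hypothetical 2-node network. The variable names `n_nodes` and `starts` are assumptions made for this example.

```python
from itertools import product

n_nodes = 2   # hypothetical smaller network; the assignment uses 3 nodes

# product() yields tuples, so convert each one to a list
starts = [list(combo) for combo in product([True, False], repeat=n_nodes)]
print(starts)
# [[True, True], [True, False], [False, True], [False, False]]
print(len(starts))   # 4, i.e. 2 ** n_nodes
```

Because the number of nodes appears only in `repeat=n_nodes`, this approach scales to larger networks without rewriting the code.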

6. Once you have enumerated all the possible starting conditions, the next step is to place the simulation inside a `for` loop so that it carries out a simulation for each set of starting conditions. During each loop, you will need to extract the next item in the nested list and assign its contents to the initial values of `V1`, `V2`, and `V3` before running the simulation. 
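
The looping structure described in this problem might look roughly like the sketch below. The update rule `toy_step` is a hypothetical stand-in, not the model from class, and the sketch assumes that `V1`, `V2`, and `V3` are lists that record each node's state over time. Replace the inner loop with the actual core of the simulation.

```python
# Hypothetical update rule: each node takes the previous value of another node
def toy_step(v1, v2, v3):
    return v3, v1, v2

starting_conditions = [[True, True, True], [True, False, False]]
final_states = []

for start in starting_conditions:
    # Reset the history lists for every new starting condition
    V1, V2, V3 = [start[0]], [start[1]], [start[2]]

    # Stand-in for the core of the simulation (5 iterations here)
    for _ in range(5):
        n1, n2, n3 = toy_step(V1[-1], V2[-1], V3[-1])
        V1.append(n1)
        V2.append(n2)
        V3.append(n3)

    final_states.append([V1[-1], V2[-1], V3[-1]])

print(final_states)
# [[True, True, True], [False, False, True]]
```

The key point is that the history lists are re-initialized inside the outer loop, so results from one starting condition do not leak into the next.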

7. Testing every possible set of starting conditions allows us to evaluate their impact on the outcome of the simulation. Let's now automate the process of comparing those outputs. One way to think about the behavior of boolean networks is that the system sometimes oscillates between 2 or more states and sometimes becomes fixed in a single state. Let's focus on the simpler case and count how many starting conditions lead to behavior that becomes fixed. We can define "fixed" as a situation where the last 10 iterations have precisely the same state. When the simulation finishes looping, the lists `V1`, `V2`, and `V3` contain the information you need to determine whether fixed is True or False. Store that value in a list where each item is itself a list that contains two items: (1) the starting condition and (2) the truth value for fixed. You can either represent the starting condition as another list (e.g., `[True, False, False]`) or encode it as a string representation (e.g., `'TFF'`). **Bonus:** are you able to draw any conclusion about the boolean model from your measurement? 
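
One way to express the "fixed" test is sketched below, assuming that each of `V1`, `V2`, and `V3` is a list of booleans with one entry per iteration. The helper `is_fixed` and the example histories are hypothetical, and other formulations are equally valid.

```python
def is_fixed(history, window=10):
    """Return True if the last `window` entries of history are all identical."""
    recent = history[-window:]
    return len(recent) == window and len(set(recent)) == 1

# Hypothetical histories for a single starting condition
V1 = [True, False] + [True] * 10   # settles on True
V2 = [False] * 12                  # constant throughout
V3 = [True, False] * 6             # keeps oscillating

# The network is fixed only if every node is fixed
fixed = all(is_fixed(v) for v in (V1, V2, V3))
print(fixed)                       # False

# Optional string encoding of a starting condition
start = [True, False, True]
label = ''.join('T' if s else 'F' for s in start)
print(label)                       # TFT

results = [[label, fixed]]         # one [starting condition, fixed] pair
```

Checking each node separately is equivalent to checking the combined three-node state, because the combined state repeats only when every node repeats.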

8. **Notebook:** Choose something that you learned from lecture, the hands-on coding in class, or your own investigation that you think will be valuable for your future programming endeavors. Using text or a mix of text and code, create an entry for your notebook. Add this to your notebook and include it here. 

9. **Thursday lunch:** Identify something that you learned from the presentation or discussion on Thursday that you found valuable. Provide a brief reflection here (1-5 sentences) and include code or pseudo-code if useful. (Hint: consider adding this to your notebook as well.)