# Data Science in Psychology & Neuroscience (DSPN): 
## Lecture 2. Introduction, continued

### Date: August 25, 2022

### To-Dos From Last Class:

* Anaconda Navigator installation
* Github setup
    * Basic: <a href="https://github.com/hogeveen-lab/DSPN_Fall2022_git">DSPN Fall 2022 Github</a>
        * Download Zip, extract, replace / update existing folder on your machine
    * Advanced: <a href="https://docs.github.com/en/desktop/contributing-and-collaborating-using-github-desktop/keeping-your-local-repository-in-sync-with-github/syncing-your-branch">Syncing your branch</a>
    * If you hate both options—let me know and I can share the Dropbox folder with you instead!!
    
### Today:

* Coding style and working in Jupyter
* Create Jupyter Notebook and start programming
* Discuss final assignment

### Homework:

* Enjoy life!

# Note on pacing:

| Hamilton would be a bad Data Scientist | Burr would be a good Data Scientist |
| :---: | :---: |
| <img src="img/hamilton/exhibits_no_restraint.gif" width="367"> | <img src="img/hamilton/wait_for_it.gif" width="220"> |

# Coding Style

## What is it?
* How YOUR code looks
    * Analogous to formatting preferences in Word docs
    * Differs person-to-person
* Things to consider
    * Comments
        * Be succinct, but too much is often better than not enough
    * Indentation
        * Tabs or spaces? If spaces, how many?
    * Code organization
        * Is there enough white space? Does the sequence make sense?
    * Naming variables and functions
        * camelCase vs. under_scores? 
        * Common logic for id'ing variables, data frames, plots, and models
    * Consistency is key
    
    
* Overarching principle: Code should be written to minimize the time it would take for someone else to understand it.
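
As a rough illustration of the principle above, here are two versions of the same calculation. The names (`f`, `mean_reaction_time`, `rts_ms`) and the numbers are made up for this example; treat it as a sketch of the idea, not a required style.

```python
# Hard to read: cryptic names, no comments, no white space
def f(x):
    return sum(x)/len(x)


# Easier to read: descriptive names, a short docstring, consistent spacing
def mean_reaction_time(reaction_times_ms):
    """Return the average of a list of reaction times (in milliseconds)."""
    total_ms = sum(reaction_times_ms)
    n_trials = len(reaction_times_ms)
    return total_ms / n_trials


# Hypothetical reaction times, invented for illustration
rts_ms = [512, 430, 478, 605]
print('mean RT (ms):', mean_reaction_time(rts_ms))
```

Both functions return the same number; only the second tells a reader what that number means.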

<img src="img/readable_code.png">

## Cells

__Sections of code__
* Analogous to paragraphs or paper sections
* Used in MATLAB (`%%`), Python (`# %%`), and R Markdown (code chunks fenced by triple backticks)

## Jupyter Cells

* Markdown (what this is)
    * A _way_ to __specify__ all ___formatting___ within the text itself
* Code (cell where you run actual Python code)

## Markdown Formatting Tips

## Header levels...
### ...are...
#### ...specified...
##### ...using...
###### ...pound signs.

* Can do bulleted lists using asterisks:
    * tabbing to indent subsequent levels
        * as many times as you need
* back to level one

1. Can also do numbered lists 
2. Number, period, space then your entry
    1. Designate subpoints by tabbing + restarting enumeration
3. Continue original list

Here is a <a href="https://medium.com/ibm-data-science-experience/markdown-for-jupyter-notebooks-cheatsheet-386c05aeebed">cheatsheet</a> with a bunch of other Markdown formatting stuff.

## Web browser
* Jupyter coding happens in a web browser
    * But it is not on the web: with a local Anaconda install, the notebook runs on your own machine
* Exporting and sharing jupyter notebooks
    * Export to HTML, or to PDF if you have LaTeX installed
    * Push to Github and work on shared code base across lab
    
## Writing Python code in Jupyter

* Python is an '__object-oriented__' programming language
    * Nearly everything you create, update, and output in Python is going to be some kind of __object__
* Python objects can be __variables__

In [1]:
variable = 9
print('your first variable equals:',variable)

your first variable equals: 9


* Variables are most often __integers__, __floats__, or __strings__

In [2]:
floating_point_number = 3.141592653589793238
integer = 3
string = 'pi'
print(string,'is equal to:',floating_point_number,'which is approximately', integer)

pi is equal to: 3.141592653589793 which is approximately 3
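
If you are unsure which kind of variable you have, the built-in `type()` function will tell you. A minimal sketch, re-defining the same three variables so the cell runs on its own:

```python
# Same values as in the cell above, re-defined so this cell is self-contained
floating_point_number = 3.141592653589793
integer = 3
string = 'pi'

# type() reports what kind of object each variable refers to
print(type(floating_point_number))
print(type(integer))
print(type(string))

# Converting between types
truncated = int(floating_point_number)  # int() drops the decimal part
as_float = float(integer)               # 3 becomes 3.0
joined = str(integer) + string          # strings are joined with +
print(truncated, as_float, joined)
```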


* Multiple variables can be assigned at once in Python

In [3]:
multi, variable, line = floating_point_number, integer, string
print(multi)
print(variable)
print(line)

3.141592653589793
3
pi


* __Mathematical operators__ can be applied to numerical variables
    * Order of operations follows __B() E** D/ M* A+ S-__

In [6]:
variable_2 = 18
math_answer = (variable_2 - variable)**2 / 9
print('your first mathematical solution is:', int(math_answer))

your first mathematical solution is: 25
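
One detail the acronym hides: division and multiplication share the same priority and are evaluated left to right, and the same goes for addition and subtraction. A short sketch with made-up numbers (the variable names are only for illustration):

```python
# Division and multiplication have equal priority and run left to right
left_to_right = 8 / 4 * 2       # evaluated as (8 / 4) * 2
with_brackets = 8 / (4 * 2)     # brackets force the multiplication first
print(left_to_right, with_brackets)

# Exponents bind tighter than a minus sign in front of the number
negated_square = -3 ** 2        # evaluated as -(3 ** 2)
square_of_negative = (-3) ** 2  # brackets make the base negative
print(negated_square, square_of_negative)
```

When in doubt, add brackets. They cost nothing and make the intended order obvious to the next reader.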


* There are some cool division tricks...
    * if you want to divide and round down to a whole number (floor division), use '//'
    * if you want the remainder left over from that division, use '%'

In [9]:
math_answer_2 = (variable_2 - variable)**3 // 40
remainder = (variable_2 - variable)**3 % 40
print('the answer in integer form is:',math_answer_2,'and the remainder is:',remainder)

the answer in integer form is: 84 and the remainder is: 15
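
Note that `//` always rounds down, which is not the same as rounding to the nearest integer. A small sketch of the difference, using made-up numbers and illustrative variable names:

```python
# 15 / 4 is 3.75: floor division rounds down, round() goes to the nearest integer
floor_result = 15 // 4
nearest_result = round(15 / 4)
print('floor:', floor_result, 'nearest:', nearest_result)

# With negative numbers, "down" means toward negative infinity
negative_floor = -15 // 4       # -3.75 rounds down to -4
negative_remainder = -15 % 4    # chosen so that 4 * -4 + remainder == -15
print('negative floor:', negative_floor, 'remainder:', negative_remainder)

# divmod() returns the floor quotient and the remainder in one step
quotient, leftover = divmod(3375, 40)
print('quotient:', quotient, 'leftover:', leftover)
```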


* Python objects might also be __collections__ (e.g. data frames or lists)

In [10]:
list_1 = [1,2,3,4,5,6]
print('here is your first list:',list_1)

here is your first list: [1, 2, 3, 4, 5, 6]
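
A quick sketch of a few things you can do with a list once you have one. The name `example_list` is used here only so that `list_1` is left untouched:

```python
example_list = [1, 2, 3, 4, 5, 6]

n_items = len(example_list)      # how many items the list holds
first_item = example_list[0]     # counting starts at 0, not 1
last_item = example_list[-1]     # negative indices count from the end
middle_items = example_list[1:3] # positions 1 and 2 (the end position is excluded)
print(n_items, first_item, last_item, middle_items)

# Lists can be changed after they are created
example_list.append(7)
print('after append:', example_list)
```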


* Python objects might also be __plots__
    * Won't get into this yet...

### Packages, sub-packages, and modules

* A ton of __packages__ have been developed
    * Some of them general purpose (e.g. NumPy, pandas)
    * Some of them psych / neuro specific (e.g. PsychoPy, PyGaze, MNE, Nipype, Brian, SPySort, etc etc etc)
* Those packages contain "sub-packages" which do a set of related things
* There are then individual "modules" or "functions" that do the stuff we want to do in Python

<img src="img/package_module.png" width="450">

### E.g. use NumPy to generate a random array of integers from 0 to 99 (the upper bound, 100, is excluded)

In [13]:
# import numpy # numpy is a package
# import numpy as np # abbreviate as np
from numpy import random as nprd # random subpackage loaded as nprd directly

# list_1 = numpy.random.randint(0,100,size=10) # randint is a function that generates arrays of random integers
# list_1 = np.random.randint(0,100,size=10)
list_1 = nprd.randint(0,100,size=10) # 10 integers from 0 up to (but not including) 100

print("Random integer list: " + str(list_1))

Random integer list: [73 74 77 75 14 94 39 95 56 88]
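
Because the numbers are random, you will get a different result every time you run the cell above. If you want the same "random" numbers on every run (handy for reproducible analyses), one option is NumPy's seeded generator. This is a sketch of an alternative, not the approach used in the cell above; the seed value 2022 and the variable names are arbitrary choices for this example.

```python
import numpy as np

# A seeded generator gives the same sequence of numbers on every run
rng = np.random.default_rng(2022)

# Like randint, integers(low, high) excludes the upper bound by default
random_ints = rng.integers(0, 100, size=10)
print("Random integer array:", random_ints)
print("type:", type(random_ints))  # a NumPy array, not a plain Python list
print("smallest:", random_ints.min(), "largest:", random_ints.max())

# endpoint=True makes the upper bound inclusive, so 100 becomes possible
inclusive_ints = rng.integers(0, 100, size=10, endpoint=True)
print("With 100 allowed:", inclusive_ints)
```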


__If you forget how to use a subpackage or module...__

<img src="img/m_baxter.png" width=400>

__Or...__