# Data Science in Psychology & Neuroscience (DSPN): 

## Lecture 2. Introduction, continued

### Date: August 20, 2020

### To-Dos From Last Class:

* Anaconda Navigator installation
* GitHub setup
    * Basic: <a href="https://github.com/hogeveen-lab/DSPN_Fall2020_git">DSPN Fall 2020 Github</a>
        * Download Zip, extract, replace / update existing folder on your machine
    * Advanced: <a href="https://github.com/hogeveen-lab/DSPN_Fall2020_git/tree/master/misc_refs/git_related">tutorial and video of git clone on a mac</a> (thanks to Laura & Ryan)
    * If untenable, let me know and I can share a working Dropbox folder!



### Today:

* Coding style and working in Jupyter
* Create Jupyter Notebook and start programming
* Discuss final assignment

### Homework:

* Enjoy life!


# Note on pacing:

| Hamilton would be a bad Data Scientist | Burr would be a good Data Scientist |
| :---: | :---: |
| <img src="img/hamilton/exhibits_no_restraint.gif" width="367"> | <img src="img/hamilton/wait_for_it.gif" width="220"> |

# Coding Style

## What is it?
* How YOUR code looks
    * Analogous to formatting preferences in Word docs
    * Differs person-to-person
* Things to consider
    * Comments
        * Be succinct, but too much is often better than not enough
    * Indentation
        * Tabs or spaces? If spaces, how many?
    * Code organization
        * Is there enough white space? Does the sequence make sense?
    * Naming variables and functions
        * camelCase vs. under_scores? 
        * Common logic for id'ing variables, data frames, plots, and models
    * Consistency is key
* Overarching principle: Code should be written to minimize the time it would take for someone else to understand it.

<img src="img/readable_code.png">
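As a small illustration of the considerations above, here is a sketch of the same calculation written twice. The names (`mean_reaction_time`, `reaction_times_ms`) and the numbers are made up for this example, and the style shown (under_scores, four-space indents) is one reasonable convention rather than the only correct one.

```python
# Hard to read: no comments, cryptic names, no white space
def f(x):return sum(x)/len(x)
a=f([512,430,608])

# Easier to read: descriptive names, a docstring, and consistent spacing
def mean_reaction_time(reaction_times_ms):
    """Return the average of a list of reaction times (in milliseconds)."""
    return sum(reaction_times_ms) / len(reaction_times_ms)

# Hypothetical reaction times from three trials
reaction_times_ms = [512, 430, 608]
average_rt = mean_reaction_time(reaction_times_ms)
print('average reaction time (ms):', average_rt)
```

Both versions compute the same number; only the second tells a reader what that number means.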

## Cells

__Sections of code__
* Analogous to paragraphs or paper sections
* Used in MATLAB (`%%`), Python (`#%%`), and R Markdown (`` ``` ``)
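For a sense of what cells look like outside of Jupyter, here is a minimal sketch of a plain Python script split into two cells. Editors such as Spyder and VS Code treat each `# %%` line (with or without the space) as the start of a new cell; the `scores` data are made up for this example.

```python
# %% Load data (hypothetical scores, made up for this example)
scores = [3, 5, 4, 2, 5]

# %% Summarize
mean_score = sum(scores) / len(scores)
print('mean score:', mean_score)
```

To plain Python the markers are ordinary comments, so the file still runs top to bottom as a normal script.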

## Jupyter Cells

* Markdown (what this is)
    * A _way_ to __specify__ all ___formatting___ within the text itself
* Code (cell where you run actual Python code)

## Markdown Formatting Tips

## Header levels...
### ...are...
#### ...specified...
##### ...using...
###### ...pound signs.

* Can do bulleted lists using asterisks:
    * tabbing to indent subsequent levels
        * as many times as you need
* back to level one

1. Can also do numbered lists 
2. Number, period, space then your entry
    1. Designate subpoints by tabbing + restarting enumeration
3. Continue original list

Here is a <a href="https://medium.com/ibm-data-science-experience/markdown-for-jupyter-notebooks-cheatsheet-386c05aeebed">cheatsheet</a> with a bunch of other Markdown formatting stuff.

## Web browser
* Jupyter coding happens in a web browser
    * But, it is not on the web
* Exporting and sharing jupyter notebooks
    * Export to HTML, or to PDF if you have LaTeX installed
    * Push to Github and work on shared code base across lab
    
## Writing Python code in Jupyter

* Python is an '__object-oriented__' programming language
    * Nearly everything you create, update, and output in Python is going to be some kind of __object__
* Python objects can be __variables__

In [39]:
variable = 6
print('your first variable equals:', variable)

your first variable equals: 6


* Variables are most often __integers__, __floats__, or __strings__

In [40]:
floating_point_number = 3.141592653589793238
integer = 3
string = 'pi'
print(string,'is equal to:',floating_point_number,'which is approximately', integer)

pi is equal to: 3.141592653589793 which is approximately 3
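If you are ever unsure what kind of variable you have, the built-in `type()` function will tell you. The sketch below reuses the variable names from the cell above and also shows the built-in conversion functions `int()`, `float()`, and `str()`.

```python
floating_point_number = 3.141592653589793
integer = 3
string = 'pi'

# type() reports what kind of object each variable holds
print(type(floating_point_number))  # <class 'float'>
print(type(integer))                # <class 'int'>
print(type(string))                 # <class 'str'>

# Converting between types
truncated = int(floating_point_number)  # 3 (drops the decimals; does not round)
as_float = float(integer)               # 3.0
as_text = str(integer) + string         # '3pi'
print(truncated, as_float, as_text)
```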


* Multiple variables can be assigned at once in Python

In [70]:
multi, variable, line = floating_point_number, integer, string
print(multi)
print(variable)
print(line)

3.141592653589793
3
pi
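One handy consequence of multiple assignment is that two variables can be swapped in a single line. A minimal sketch follows; the names `first`, `second`, `one`, and `two` are made up for this example. It also shows what happens when the number of names and values do not match.

```python
first, second = 'pi', 3

# Swap the two values without a temporary variable
first, second = second, first
print(first, second)  # 3 pi

# The number of names on the left must match the number of values on the right
try:
    one, two = 1, 2, 3
except ValueError as error:
    print('unpacking failed:', error)
```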


* __Mathematical operators__ can be applied to numerical variables
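    * Order of operations follows __B() E** D/ M* A+ S-__ (division and multiplication share one level and run left to right, as do addition and subtraction)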

In [41]:
variable = 6 # reset: the multiple-assignment cell above changed variable to 3
variable_2 = 14
math_answer = (variable_2-variable)**2 / 30
print('your first mathematical solution is:', math_answer)

your first mathematical solution is: 2.1333333333333333
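A few quick checks of the order of operations, as a sketch. The variable names are made up for this example. Note that `/` and `*` are evaluated left to right, and that `**` binds more tightly than a leading minus sign.

```python
left_to_right_1 = 12 / 3 * 2    # (12 / 3) * 2 = 8.0
left_to_right_2 = 12 * 3 / 2    # (12 * 3) / 2 = 18.0
add_subtract = 10 - 4 + 3       # (10 - 4) + 3 = 9
stacked_powers = 2 ** 3 ** 2    # 2 ** (3 ** 2) = 512; ** groups right to left
negative_power = -3 ** 2        # -(3 ** 2) = -9
bracketed_power = (-3) ** 2     # 9

print(left_to_right_1, left_to_right_2, add_subtract)
print(stacked_powers, negative_power, bracketed_power)
```

When in doubt, add brackets: they cost nothing and make the intent obvious to the next reader.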


* Division is a bit weird... 
    * If you want the whole-number quotient and its remainder, use `//` and `%`.

In [42]:
math_answer_2 = (variable_2-variable)**2 // 30 # // is floor division: divides, then rounds down to a whole number
remainder = (variable_2-variable)**2 % 30 # % to get the remainder
print('the answer is',math_answer_2,'remainder',remainder)

the answer is 2 remainder 4
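The built-in `divmod()` returns the quotient and remainder in one step. The sketch below uses 64, the value of `(14 - 6)**2` from the cell above, and also shows that `//` rounds down rather than to the nearest integer, which matters for negative numbers.

```python
# Quotient and remainder together
quotient, remainder = divmod(64, 30)
print(quotient, remainder)  # 2 4

# // always rounds down (toward negative infinity)
positive_floor = 7 // 2     # 3
negative_floor = -7 // 2    # -4, not -3
negative_remainder = -7 % 2 # 1, because -7 == 2 * (-4) + 1
print(positive_floor, negative_floor, negative_remainder)

# round() is what you want for the nearest integer
nearest = round(64 / 30)    # 2
print(nearest)
```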


* Python objects might also be __collections__ (e.g. data frames and lists)

In [56]:
list_1 = [-96,-66,-87,86,-88,-25,24,59,-99,59]
print('here is your first list :',list_1)

here is your first list : [-96, -66, -87, 86, -88, -25, 24, 59, -99, 59]
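Once you have a list, you will usually want to pull things out of it. Here is a short sketch using the same list and only built-in Python; the variable names other than `list_1` are made up for this example. Note that Python counts positions from 0.

```python
list_1 = [-96, -66, -87, 86, -88, -25, 24, 59, -99, 59]

n_items = len(list_1)        # 10
first_item = list_1[0]       # -96 (indexing starts at 0)
last_item = list_1[-1]       # 59 (negative indices count from the end)
middle_slice = list_1[2:5]   # [-87, 86, -88] (start included, stop excluded)
sorted_copy = sorted(list_1) # new sorted list; list_1 itself is unchanged

print(n_items, first_item, last_item)
print(middle_slice)
print(max(list_1), min(list_1))  # 86 -99
```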


* Python objects might also be __plots__
    * Won't get into this yet...

### Write code that prints "Hello World!"

In [74]:
# boring version -- note: THIS IS A COMMENT. YOU CAN COMMENT OUT ANY LINES BY HIGHLIGHTING AND DOING COMMAND/CTRL+/
print('boring: Hello World! by typing it out')

# fancier version
firstword='Hello'
secondword='World!'
print('fancier:',firstword,secondword,'by concatenating strings :)')

boring: Hello World! by typing it out
fancier: Hello World! by concatenating strings :)
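Strictly speaking, `print` with commas prints each argument separated by a space rather than building one string. If you need a single combined string (say, for a file name or plot title), here is a sketch of two common ways to make one; the names `joined_with_plus` and `joined_with_fstring` are made up for this example.

```python
firstword = 'Hello'
secondword = 'World!'

# + joins strings end to end, so the space has to be added explicitly
joined_with_plus = firstword + ' ' + secondword

# f-strings substitute variables into the {} placeholders
joined_with_fstring = f'{firstword} {secondword}'

print(joined_with_plus)     # Hello World!
print(joined_with_fstring)  # Hello World!
```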


### Packages, sub-packages, and modules

* A ton of __packages__ have been developed
    * Some of them general purpose (e.g. NumPy, pandas)
    * Some of them psych / neuro specific (e.g. PsychoPy, PyGaze, MNE, Nipype, Brian, SPySort, etc.)
* Those packages contain "sub-packages" which do a set of related things
* Sub-packages in turn contain individual "modules" and "functions" that do the stuff we want to do in Python

<img src="img/package_module.png" width="450">

### E.g. use NumPy to generate a random list of integers from 0 to 99

In [83]:
import numpy # numpy is a package
# import numpy as np
# from numpy import random as nprd # random is a sub-package within numpy

list_1 = numpy.random.randint(0,100,size=10) # randint is a function that generates random integers; the upper bound (100) is excluded
# list_1 = np.random.randint(0,100,size=10)
# list_1 = nprd.randint(0,100,size=10)

print("Random integer list: " + str(list_1))

Random integer list: [56 81 55 53 80 62 65 97 25 71]
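The NumPy documentation for `randint` recommends that new code use the `integers` method of a `default_rng()` generator instead. Here is a minimal sketch of that approach; the seed value of 2020 is arbitrary and only there so the same numbers come out on every run.

```python
import numpy as np

# Create a random number generator; the seed (an arbitrary choice) makes results repeatable
rng = np.random.default_rng(seed=2020)

# Ten integers from 0 up to, but not including, 100
random_integers = rng.integers(0, 100, size=10)
print("Random integer array:", random_integers)

# The result is a NumPy array, not a Python list; convert if you need a list
random_list = random_integers.tolist()
print(type(random_integers), type(random_list))
```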


__If you forget how to use a subpackage or module...__

<img src="img/m_baxter.png" width=400>

__Or...__

In [82]:
# In Jupyter, a trailing ? displays an object's docstring
# numpy.random?

numpy.random.randint?

Docstring:
randint(low, high=None, size=None, dtype='l')

Return random integers from `low` (inclusive) to `high` (exclusive).

Return random integers from the "discrete uniform" distribution of
the specified dtype in the "half-open" interval [`low`, `high`). If
`high` is None (the default), then results are from [0, `low`).

.. note::
    New code should use the ``integers`` method of a ``default_rng()``
    instance instead; see `random-quick-start`.

Parameters
----------
low : int or array-like of ints
    Lowest (signed) integers to be drawn from the distribution (unless
    ``high=None``, in which case this parameter is one above the
    *highest* such integer).
high : int or array-like of ints, optional
    If provided, one above the largest (signed) integer to be drawn
    from the distribution (see above for behavior if ``high=None``).
    If array-like, must contain integer values
size : int or tuple of ints, optional
    Output shape.  If the given shape is, e.g.,
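The trailing `?` only works in Jupyter / IPython. In plain Python, the built-in `help()` function and the standard library's `inspect.getdoc()` give the same information. A small sketch follows, using the standard library's `random.randint` as the example so that it runs without NumPy installed.

```python
import inspect
import random

# inspect.getdoc returns the docstring as an ordinary string
docstring = inspect.getdoc(random.randint)
print(docstring)

# help() prints the signature and docstring together
help(random.randint)
```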

# Final assignment

__What is the final assignment?__
* 10-15 minute presentation (pre-recorded)
* Any associated code / documentation

__What will you do for your assignment?__
* Learn a new thing, implement it, and show me what you did
    * New thing can, but doesn't have to, be related to your graduate research
    * New thing must be in some way related to data science
* Examples
    * Learn how to implement the math underlying a computational model
    * Test out a novel (to you) way of modeling an existing dataset
    * Automate (part of) an existing analysis routine

__How to get a good grade?__
* Like everything, 80% devoted to effort
* To earn the final 20%, make it clear you pushed yourself and learned something new

__Why?__
* I want you to feel empowered to learn new quantitative and computational skills

__If this is how you feel partway through...__

<img src="img/seth_idiot.gif">

__...then you are on the right path to...__

<img src="img/trevor.gif">

