# 2021 Course Syllabus  

- toc: true 
- badges: false
- comments: true
- categories: [Syllabus]


# Boot Camp 
The five-day Boot Camp pre-course will introduce learners to programming and computing fundamentals. Each day consists of five short (up to 25 minutes) online sessions and complementary reading material. The goal of this pre-course is to give learners with little or no background the essential vocabulary needed to read and comprehend code. The complementary material also contains exercises that let learners test their understanding of the covered material.

#hide
https://chrisalbon.com/

## **Day One - Bash essentials** 

> note: check gwdg framework and consider doing a reduced version of this to start up python early

### **Introduction to Linux based command line interactions**
- Background & historical context 
- Basic commands 
- File operations 

### **Environment baseline**
- User environment
- Permissions 

### **Customisation of your environment**
- Aliases 
- Text editors (vi/vim, nano or emacs)
- Symbolic links 

### **CLI Programming**
- Basic scripting 
- Functions and shell scripts 
- Text manipulation (cat, echo, sed, awk, grep, tr, wc, cut)

### **Remote interaction** 
- Rsync 
- SSH 
- Git 

#hide

### Style Guide Recommendation
1. General Guidelines
    - Explicit code
    - One statement per line
    - Function arguments
    - Unpacking
1. Zen of Python
1. PEP 8
1. Conventions
    - Check if a variable equals a constant
    - Access a Dictionary Element
    - Short Ways to Manipulate Lists
    - Filtering a list
    - Modifying the values in a list
    - Line Continuations
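
As an illustration of the conventions listed above, here is a minimal sketch in plain Python. The variable names and sample values are hypothetical and are not taken from the course material; the idioms shown are common recommendations, not the only acceptable style.

```python
# Hypothetical sample data, used only for illustration.
scores = {"alice": 82, "bob": 57}
values = [3, 8, 1, 10, 5]
is_ready = True

# Check if a variable equals a constant: rely on truthiness
# instead of writing "if is_ready == True".
status = "ready" if is_ready else "waiting"

# Access a dictionary element: dict.get returns a default
# instead of raising KeyError for a missing key.
carol_score = scores.get("carol", 0)

# Filtering a list: build a new list rather than removing
# items from the list while iterating over it.
large_values = [v for v in values if v > 4]

# Modifying the values in a list: create a new list with a comprehension.
doubled = [v * 2 for v in values]

# Unpacking: split a sequence into its head and the remaining items.
first, *rest = values

# Line continuations: wrap the expression in parentheses
# instead of ending lines with a backslash.
total = (
    sum(values)
    + carol_score
)

print(status, carol_score, large_values, doubled, first, rest, total)
```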

## **Day Two - Python essentials** 

### **Jupyter notebooks and Python style** 

- What is Markdown
- General Guidelines
- Zen of Python
- Conventions

### **Python core tools**

- Printing in Python
- Data Types
- Variables
- Expressions
- String Operations

### **Data Structures in Python** 

- Lists
- Tuples
- Sets
- Dictionaries

### **Programming Fundamentals using Python** 

- Conditions
- Loops
- List comprehensions
- Lambda Functions
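
The topics above can be previewed with a short sketch. The reaction-time values and the 500 ms threshold are hypothetical and chosen only for illustration.

```python
# Hypothetical reaction times in milliseconds, used only for illustration.
reaction_times = [420, 515, 388, 602, 471]

# Loop with a condition: collect the trials slower than a threshold.
slow_trials = []
for rt in reaction_times:
    if rt > 500:
        slow_trials.append(rt)

# The same selection written as a list comprehension.
slow_trials_short = [rt for rt in reaction_times if rt > 500]

# A lambda is a small anonymous function, handy as a sort key:
# order the trials by their distance from 450 ms.
by_distance = sorted(reaction_times, key=lambda rt: abs(rt - 450))

print(slow_trials, slow_trials_short, by_distance)
```

The loop and the comprehension produce the same list; the comprehension is usually preferred when the body is a single expression.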

### **Taking Python to the next level** 

- Functions
- Classes 
- Modules
- Packages 


  

#hide

https://totaldatascience.com/wp-content/uploads/2019/10/p64.png

https://medium.com/@mrbriit/20-free-visualization-cheat-sheets-for-every-data-scientist-to-download-ceee741ba3ca

https://procomm.ieee.org/elements-of-visual-communication/

1. Deviation from a baseline 
1. Correlation between variables  
1. Ranking across categories 
1. Distribution of variables   
1. Temporal Change of variables    
1. Relative and absolute magnitude comparisons of variables
1. Part-to-whole Relationship 
1. Spatial magnitude, change or deviation plots
1. Flow plots 

### Visual communication basics 

1. Graphic elements attributes 
    - Shape
    - Position
    - Size
    - Color
    - Orientation 
    - Opacity 

1. Composition attributes
    - Contrast 
    - Repetition
    - Alignment 
    - Proximity
    - Hierarchy
    - Grouping
    - Sequence
    - Space




## **Day Three - Data visualization basics** 

### **Introduction to Data visualization**

- What is data visualization? 
- Why is data visualization important?
- What are Data Visualization Techniques?

### **Visual communication basics** 

- Perception and communication
- Graphic elements 
- Composition basics 

### **Introduction to exploratory data visualization**

- What is exploratory data visualization?
- How can we use exploratory plots to explore our data?
- Effective exploratory data visualization

### **Static data visualization (matplotlib and seaborn)**

- Figure elements
- Mastering the axes 
- Practical examples with code 

### **Dynamic data visualization (Plotly or Altair)**

- When to use dynamic plots
- Quick hacks
- Practical examples with code 




## **Day Four - NumPy essentials** 

### **Data types**

- Boolean
- Object
- Scalars
- Vectors, Arrays, Volumes and Beyond


### **Array objects**

- Adding, removing, sorting elements
- Indexing, and slicing arrays 
- Views and copies of arrays
- Array operations (e.g. sum, max, etc.)


### **Matrix creation and manipulation** 

- Creating matrices 
- Matrix operations
- Vectorization
- Concrete examples 


### **Random sampling and distributions**

- Uniform random data creation
- Using NumPy to perform permutations 
- Using NumPy to sample from a distribution

### **NumPy and SciPy statistics** 

- Order statistics
- Averages and variances
- Correlation and histograms


#hide
https://github.com/guipsamora/pandas_exercises

## **Day Five - Pandas essentials** 

### **Series and DataFrames in pandas** 
- Creating, Updating, Extending and Saving Pandas Data Structures
- Data types 
- Dataframe methods

### **DataFrames navigation basics** 
- Indexing and slicing 
- Sorting and reordering
- Multi-index and reindexing

### **Data cleaning basics** 
- Filtering, Sorting and dropping
- Concatenating, merging and joining
- Imputing, removing and flagging

### **Data transformation basics** 
- Transformation
- Rescaling
- Encoding 

### **Data summarization basics** 
- Reshaping and melting
- Grouping, pivot and pivot tables 
- Statistics and reports 




# **Artificial Intelligence & Intelligent Systems**

This course aims to provide both intuitions and applied knowledge of fundamental methods in modern data science and how to use them to address neurocognitive questions. It will also provide you with a strong foundation in how to conduct reproducible research using Python and central scientific Python packages. Finally, each week a student pair will be asked to review and present a scientific paper that exemplifies the centrality of that week's concepts in the neurocognitive field.

## **Week One - Intelligence**

> Note: 15 minutes (10 min presentation + 5 min discussion) 

### **Course introduction and guidelines** 

> Note: 25 minutes (15 min presentation + 10 min discussion) 

### **What is Intelligence**

> Note: 25 minutes (15 min presentation + 10 min discussion) 

### **What is Artificial Intelligence**

> Note: 25 minutes (15 min presentation + 10 min discussion) 

### **Neuroscience and intelligence**

#hide

wiki: [Neuroscience and intelligence](https://en.wikipedia.org/wiki/Neuroscience_and_intelligence#Humans)

## **Week Two - Reproducible research**

### **Ten Simple Rules for Reproducible Research**

> Note: 30 minutes (20 min students presentation + 10 min discussion) 

Article: [Ten Simple Rules for Reproducible Research in Jupyter Notebooks](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007007&utm_campaign=The%20ML%20Times%20&utm_medium=email&utm_source=Revue%20newsletter)
 
 
### **Markdown basics in Jupyter**  

> Note: 30 minutes Hands-on exercise 

1. Markdown Syntax
1. Hyperlinks And References
1. Mathematical Equations And LaTeX
1. Creating Tables 

### **From Replicability to Reproducibility**

> Note: 30 minutes presentation


1. Levels of Reproducibility
1. Things to avoid
1. Tools of the trade 

#hide
 
Article: [A Cognitive Interpretation of Data Analysis](https://onlinelibrary.wiley.com/doi/full/10.1111/insr.12028?casa_token=m3tegURGHpQAAAAA%3A4oSS06AOaulGnl_a1NM3sLdwTEHtb5EsvrDUs8sgQmSYNLx7JBTIH27aMlz_UJ0FzNuiG6RvPeuEUOB8)

## **Week Three - Data Acquisition, cleaning and Curation**

### **Reproducibility in Cognitive Neuroscience**

> Note: 30 minutes (20 min students presentation + 10 min discussion) 

Article: [Progress Toward Openness, Transparency, and Reproducibility in Cognitive Neuroscience](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5545750/)

### **Dealing with big data** 

> Note: 45 minutes Hands-on exercise 

1. Downloading and storing data locally 
1. Cleaning data in a non-destructive way 
1. Creating a pre-processing report 

### **Data curation challenges in different domains** 

> Note: 15 minutes presentation

1. Mapping the data lake of Cognitive Neuroscience
1. Modality general and specific pitfalls 
1. Challenges in Data synthesis 

## **Week Four - Exploratory Data Analysis**

### **Point of View: Open exploration**

> Note: 30 minutes (20 min students presentation + 10 min discussion) 

Article: [Point of View: Open exploration](https://elifesciences.org/articles/52157)

### **Exploratory Data Analysis**

> Note: 45 minutes Hands-on exercise 

1. Generating an EDA preliminary report 
1. Descriptive EDA Tables
1. EDA Plots

### **Data Analysis exploratory cycle** 

> Note: 15 minutes presentation
>
> Note: Contrast with confirmatory analysis

1. Preliminary inspection of data using EDA
1. Common First Steps In Any EDA
1. Different Data Types
1. Populations, Samples And Distributions
1. Things to avoid

## **Week Five - Data mining and feature engineering**

### **Fantastic Features and Where to Find Them**

> Note: 30 minutes (20 min students presentation + 10 min discussion) 

Article: [Fantastic Features and Where to Find Them: Detecting Cognitive Impairment with a Subsequence Classification Guided Approach](https://kopernio.com/viewer?doi=arxiv%3A2010.06579&token=WzM2MTksImFyeGl2OjIwMTAuMDY1NzkiXQ.8Aq4-_93nlp2uj2OIv-xZ1cyTBo)

### **Creating features based on domain expertise**

> Note: 30 minutes of Hands-on exercise 

1. Downloading and storing data locally 
1. Cleaning data in a non-destructive way 
1. Creating a pre-processing report 


### **A Brief Introduction to Feature Engineering**

> Note: 30 minutes presentation

1. Variable transformation

    - Change of scale
    - Simplify non-linear relationships into linear relationships
    - Transform a skewed distribution into a symmetric distribution

2. Transformation methods

    - Logarithm
    - Square / Cube root
    - Binning

3. Variable / Feature creation
    - Derived variables
    - Dummy variables
    - Domain expert variables 
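
The transformation and feature-creation ideas listed above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the latency values, group labels and bin cut-offs are hypothetical assumptions, and in practice the same steps would more commonly be done with NumPy or pandas.

```python
import math

# Hypothetical right-skewed values and group labels, for illustration only.
latencies = [1.0, 10.0, 100.0, 1000.0]
groups = ["control", "patient", "control", "patient"]

# Logarithm: compresses large values and reduces right skew.
log_latencies = [math.log10(x) for x in latencies]

# Square root: a milder transformation than the logarithm.
sqrt_latencies = [math.sqrt(x) for x in latencies]


def to_bin(x):
    """Binning: map a continuous value onto ordered categories.

    The cut-offs (5 and 500) are arbitrary assumptions.
    """
    if x < 5:
        return "fast"
    if x < 500:
        return "medium"
    return "slow"


latency_bins = [to_bin(x) for x in latencies]

# Dummy variable: encode a two-level category as 0/1.
is_patient = [1 if g == "patient" else 0 for g in groups]

print(log_latencies, sqrt_latencies, latency_bins, is_patient)
```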

## **Week Six - Dimensionality reduction and unsupervised learning** 

### **Ghosts in machine learning for cognitive neuroscience**

> Note: 30 minutes (20 min students presentation + 10 min discussion) 

Article: [Ghosts in machine learning for cognitive neuroscience: Moving from data to theory](https://www.sciencedirect.com/science/article/abs/pii/S1053811917306663)

### **A Brief Introduction to Unsupervised Learning (part 1)**
> Note: ICA is covered in the methods of week three

1. What is Dimensionality reduction?
1. Feature selection/elimination
1. Feature extraction
1. Visualizing high dimensional spaces 

> Note: 30 minutes online presentation

### **Dimensionality reduction: a multi-edged sword**

> Note: 30 minutes of Hands-on exercise 

1. Understanding Dimensionality reduction application and theory 
    1. Principal component analysis basics
    1. Global vs local non-linear approaches
    1. Visualising complex states 

## **Week Seven - Data clustering** 

### **A Brief Introduction to Unsupervised Learning (part 2)**

> Note: 30 minutes presentation

1. Cluster Analysis Basics
    - What is Cluster Analysis?
    - Applications of Cluster Analysis
    - Putting Clustering into Context
    - The Benefits of Cluster Analysis
    - The Different Types of Cluster Analysis
1. What are proximity metrics?
    - Distance metrics
    - Similarity metrics    
1. Intuition behind common Cluster Algorithms
1. Clustering performance evaluation for Known classes and Unknown classes 
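
To make the difference between a distance metric and a similarity metric concrete, here is a minimal sketch using only the standard library. Euclidean distance and cosine similarity are chosen as two common examples; the function names and the sample vectors are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Distance metric: 0 means identical points, larger means further apart."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Similarity metric: 1 means same direction, 0 means orthogonal.

    Undefined for zero vectors (division by zero), which this sketch
    does not guard against.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Two hypothetical points that lie on the same line through the origin.
p = [1.0, 2.0]
q = [2.0, 4.0]

print(euclidean_distance(p, q))  # non-zero: the points are apart
print(cosine_similarity(p, q))   # close to 1: same direction
```

The two points are far apart by distance yet maximally similar by angle, which is why the choice of proximity metric can change which observations a clustering algorithm groups together.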
    

### **Visual field map clusters** 
TODO: change example with brain volume as a function of sex 

> Note: 30 minutes (20 min students presentation + 10 min discussion) 

Article: [Visual field map clusters in human frontoparietal cortex](https://elifesciences.org/articles/22974)

### **Comparing clustering algorithms on controlled toy data-sets**

> Note: 30 minutes of Hands-on exercise 

1. K-Means Clustering
2. Spectral Clustering
3. Density-Based Spatial Clustering of Applications with Noise
4. Agglomerative Hierarchical Clustering

## **Week Eight - Supervised learning (classification models)**

### **A Brief Introduction to Supervised Learning (part 2)**

> Note: 30 minutes presentation

1. Introduction to classification models
1. Parametric and Non-parametric classification
1. Benefits and Challenges of classification models
1. Intuition behind common classification Algorithms
1. Classification performance evaluation metrics
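
As a preview of classification performance evaluation, the sketch below computes accuracy, precision, recall and F1 for a binary problem from scratch. The function name and the label vectors are hypothetical; in practice a library such as scikit-learn would normally be used instead.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute basic binary classification metrics from two label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical true and predicted labels, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```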


### **Classification of Early and Late MCI Using rs-fMRI**

> Note: 30 minutes (20 min students presentation + 10 min discussion) 

[Classification of Early and Late Mild Cognitive Impairment Using Functional Brain Network of Resting-State fMRI](https://www.frontiersin.org/articles/10.3389/fpsyt.2019.00572/full)




### **Comparing classification algorithms on controlled toy data-sets**

> Note: 30 minutes of Hands-on exercise

1. K-nearest neighbors  
1. Logistic Regression
1. Support Vector Machines
1. Classification trees 

## **Week Nine - Supervised learning (regression models)**

### **A Brief Introduction to Supervised Learning (part 1)**

> Note: 30 minutes presentation

1. Introduction to Predictive regression Modeling
1. Parametric and Non-parametric Models
1. Benefits and Challenges of Predictive regression Modeling
1. Intuition behind common Predictive Algorithms
1. Prediction performance evaluation metrics
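
For regression models, three commonly used evaluation metrics are the mean squared error (MSE), the mean absolute error (MAE) and the coefficient of determination (R²). The sketch below computes them with the standard library only; the function name and the sample values are hypothetical.

```python
def regression_metrics(y_true, y_pred):
    """Compute MSE, MAE and R^2 for paired true and predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    mse = sum(r ** 2 for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n

    # R^2 compares the model's squared error with that of always
    # predicting the mean of y_true. Undefined if y_true is constant.
    mean_true = sum(y_true) / n
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot

    return {"mse": mse, "mae": mae, "r2": r2}


# Hypothetical observed and predicted values, for illustration only.
y_true = [2.0, 4.0, 6.0, 8.0]
y_pred = [2.5, 3.5, 6.5, 7.5]

reg_metrics = regression_metrics(y_true, y_pred)
print(reg_metrics)
```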

### **Predicting general intelligence from fMRI**

> Note: 30 minutes (20 min students presentation + 10 min discussion) 

[A distributed brain network predicts general intelligence from resting-state human neuroimaging data](https://royalsocietypublishing.org/doi/full/10.1098/rstb.2017.0284)


### **Comparing regression algorithms on controlled toy data-sets**

> Note: 30 minutes of Hands-on exercise 

1. Linear Regression
1. From Lasso to Ridge Regression
1. Support Vector Machines 
1. Polynomial regression

## **Week Ten - Supervised learning (ensemble models)** 

### **A Brief Introduction to Supervised Learning (part 3)**

> Note: 30 minutes presentation

1. What is ensemble learning?
1. Benefits and Challenges of ensemble learning models
1. Intuition behind bagging, boosting and stacking
1. Opening the black box - what insights can we gain from looking inside ensemble models

### **Improving fluid intelligence predictive modelling using Bootstrap aggregation**

> Note: 30 minutes (20 min students presentation + 10 min discussion) 

[Bootstrap aggregating improves the generalizability of Connectome Predictive Modelling](https://www.biorxiv.org/content/10.1101/2020.07.08.193664v1.full)

### **End of course individual assignment** 

> Note: 30 minutes overview and discussion of the end of course data-set and assignment requirements