---
title: "Python Overview"
author: "Dr. Eyal Soreq" 
start: true
date: "05/03/2021"
teaching: 15
exercises: 0
questions:
- What is Python?
- Why Python?
objectives:
- Understand the benefits of using Python as a central tool in your PhD.
- Understand Python's position in the crowded data analytics ecosystem
keypoints:
- Python is an easy and versatile programming language
- It has one of the largest open-source communities
- Python is a dynamic programming language
- Python in conjunction with Jupyter Notebook provides a strong solution for almost all research projects
---

# What is Python?
- [Python](https://www.python.org/) is a popular high-level, object-oriented, interpreted programming language created by Guido van Rossum.

>  Python is powerful... and fast;  
plays well with others;  
runs everywhere;  
is friendly & easy to learn;  
is Open.  
> *Taken from the Python website*
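
To make the terms "dynamic", "object-oriented" and "high-level" a little more concrete, here is a minimal sketch. The variable names and values are hypothetical and chosen only for illustration; any recent Python 3 interpreter should run it.

```python
# Dynamic typing: the same name can be rebound to values of different types.
value = 42
first_type = type(value).__name__    # "int"
value = "forty-two"
second_type = type(value).__name__   # "str"
print(first_type, second_type)

# Object-oriented: every value is an object with its own methods.
shouted = value.upper()
print(shouted)

# High-level: built-in functions keep common tasks short.
readings = [2.0, 3.0, 4.0]           # hypothetical measurements
mean = sum(readings) / len(readings)
print(mean)
```

Because Python is interpreted, each of these lines can also be typed and run one at a time in a notebook cell, which is how the rest of this module is meant to be followed.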

# Why Python?

- It is easy to learn, and its syntax is straightforward to both read and write.
- In data science in general, and in neuroscience in particular, it is quickly becoming the lingua franca of the field.
- It is popular beyond academia.

In [21]:
pip install pandas_bokeh


Collecting pandas_bokeh
  Downloading pandas_bokeh-0.5.5-py2.py3-none-any.whl (29 kB)
Installing collected packages: pandas-bokeh
Successfully installed pandas-bokeh-0.5.5
Note: you may need to restart the kernel to use updated packages.


In [23]:
import pandas_bokeh
import numpy as np

pandas_bokeh.output_notebook()


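In [11]:
import pandas as pd
import matplotlib.pyplot as plt

# Set the global font size before any figure is created
plt.rcParams.update({'font.size': 22})
df = pd.DataFrame({"Data science tool": ['Python','SQL','R','C++','Java','C','JavaScript','MATLAB','Other','Bash'],
 "% of respondents using the tool" : [86.7,42.1,23.9,21.4,18.8,18.5,16.7,12.4,10.9,9.9]})
# Create the figure and axes explicitly and draw on them, so that
# savefig writes the plotted figure rather than an empty one
fig, ax = plt.subplots(figsize=(10,5))
df.plot.barh(x="Data science tool",y="% of respondents using the tool",ax=ax,fontsize=20)
ax.invert_yaxis()  # largest value at the top
fig.savefig('output.png', bbox_inches='tight')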

In [27]:
df = pd.DataFrame({"Tool": ['Python','SQL','R','C++','Java','C','JavaScript','MATLAB','Other','Bash'],
 "pct" : [86.7,42.1,23.9,21.4,18.8,18.5,16.7,12.4,10.9,9.9]})
df.plot_bokeh(
    kind='bar',
    x='Tool',
    y='pct', 
    xlabel='Data science tool',
    ylabel='% of respondents using the tool',
    title='2020 Data Science and Machine Learning Survey',
    legend=False
)

Taken from the Kaggle data science community's "2020 Data Science and Machine Learning Survey".

# Some basic recommendations

- Python projects can involve complex development or implementation; in those cases, an integrated development environment (IDE) such as PyCharm is recommended.
- While using an IDE is not discouraged, I feel it isn't necessary for applied data science projects (as opposed to algorithm development, for example).
- Moreover, to help the language syntax sink in, I recommend forcing yourself to use only the JupyterLab or Jupyter Notebook interface at the beginning.
- That said, my current favourite tool is Visual Studio Code, which has good support for Python and Jupyter.

# Module Recommendation

- This course's main objective is to familiarize you with Python and the Jupyter framework, so that writing code becomes intuitive.
- All slides you will see were created using Jupyter, and all code snippets can be copied and pasted into your notebook.
- I urge you, however, to avoid doing so.
- Instead, type the commands into your notebook yourself and try to add your own comments.
- Keep in mind that you are doing this for your future self, who in six months will want to know how to do something trivial; having it in one (or more) notebooks that you created yourself will be worthwhile.

# What this module will cover
- This module is composed of several sections going over Python essentials such as: 
    1. Python Basics (e.g. Syntax Essentials, Keywords, Variables, Data Types, Comments and Operators)
    1. Python Data Structures (e.g. Lists, Tuples, Dictionaries and Sets)
    1. Python Programming Fundamentals (Conditions, Loops and Functions)
- The goal is to quickly give you a cheat sheet to start using Python as a data scientist.
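
As a small taste of the three sections listed above, the following sketch touches one or two ideas from each. It is only an illustrative preview, not the module's actual material: the names (`n_participants`, `describe`, and so on) are hypothetical, and the percentages are borrowed from the survey figures shown earlier.

```python
# 1. Basics: variables, data types, comments and operators
n_participants = 16                  # int (hypothetical value)
mean_age = 27.5                      # float (hypothetical value)
is_small = n_participants < 20       # a comparison operator yields a bool

# 2. Data structures
tools = ["Python", "SQL", "R"]            # list: ordered and mutable
point = (3, 4)                            # tuple: ordered and immutable
usage = {"Python": 86.7, "SQL": 42.1}     # dict: maps keys to values
unique_tools = set(tools + ["Python"])    # set: duplicates are dropped

# 3. Programming fundamentals: functions, conditions and loops
def describe(tool, pct):
    """Return a short label for a tool's usage percentage."""
    if pct > 50:
        return f"{tool}: majority ({pct}%)"
    return f"{tool}: minority ({pct}%)"

for tool, pct in usage.items():
    print(describe(tool, pct))
```

Typing this out rather than pasting it, and changing the values to see what happens, is a quick way to check which of these ideas are already familiar.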


## Links to expand your understanding 

For those interested in learning more...

- [Conda Essentials](https://learn.datacamp.com/courses/conda-essentials)
- [Building and Distributing Packages with Conda](https://learn.datacamp.com/courses/building-and-distributing-packages-with-conda)
- [Some background on IPython and Jupyter](https://www.datacamp.com/community/blog/ipython-jupyter)
- [Jupyter Notebook Tutorial: The Definitive Guide](https://www.datacamp.com/community/tutorials/tutorial-jupyter-notebook)


{% include links.md %}

In [30]:
a = 'This is some string'
b = "This is some string"
print(a == b)

True


In [31]:
a,b = 'test', "test"
print(f" Both tests are the same i.e. a={a} and b={b} and a==b is {a==b}") 

 Both tests are the same i.e. a=test and b=test and a==b is True
