## Python and Jupyter Notebook

https://realpython.com/jupyter-notebook-introduction/

The Jupyter Notebook is an open source web application that you can use to create and share documents that contain live code, equations, visualizations, and text.

The name Jupyter comes from the core programming languages it supports: Julia, Python, and R. Jupyter ships with the __IPython__ kernel, which lets you write your programs in Python, but there are currently over 100 other kernels that you can also use.

In [None]:
# Lists
thislist = ["apple", "banana", "cherry", "apple", "cherry"]

In [None]:
print(len(thislist))

In [None]:
list1 = ["apple", "banana", "cherry"]
list2 = [1, 5, 7, 9, 3]
list3 = [True, False, False]

In [None]:
print(thislist[1])

In [None]:
print(thislist[-1])

In [None]:
thislist = ["apple", "banana", "cherry", "orange", "kiwi", "melon", "mango"]
print(thislist[2:5])

In [None]:
thislist = ["apple", "banana", "cherry", "orange", "kiwi", "melon", "mango"]
print(thislist[:4])

In [None]:
thislist = ["apple", "banana", "cherry"]
if "apple" in thislist:
    print("Yes, 'apple' is in the fruits list")

In [None]:
# Change values
thislist = ["apple", "banana", "cherry"]
thislist[1] = "blackcurrant"
print(thislist)

In [None]:
thislist = ["apple", "banana", "cherry", "orange", "kiwi", "mango"]
thislist[1:3] = ["blackcurrant", "watermelon"]
print(thislist)

In [None]:
thislist = ["apple", "banana", "cherry"]
thislist[1:3] = ["watermelon"]
print(thislist)
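
The last two cells assign to a slice, and the replacement does not need to have the same length as the slice. A minimal sketch of how this shrinks and grows a list (the variable name `fruits` is only illustrative):

```python
fruits = ["apple", "banana", "cherry"]

# Replacing a two-element slice with one element shrinks the list.
fruits[1:3] = ["watermelon"]
print(fruits)  # ['apple', 'watermelon']

# Replacing a one-element slice with two elements grows it again.
fruits[1:2] = ["banana", "cherry"]
print(fruits)  # ['apple', 'banana', 'cherry']
```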

In [8]:
# Dictionaries
data = {
 'student' : ['Tom', 'Jerry', 'Gloria', 'Hillary'],
 'age' : [21, 34, 45, 67],
 'gender' : ['Male', 'Female', 'Female', 'Male']
 }
data

{'age': [21, 34, 45, 67],
 'gender': ['Male', 'Female', 'Female', 'Male'],
 'student': ['Tom', 'Jerry', 'Gloria', 'Hillary']}
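
Values in a dictionary are looked up by key. The sketch below reuses the `data` dictionary from the cell above; the names `ages`, `keys`, and `first_student` are introduced here only for illustration. Note that Python 3.7+ keeps keys in insertion order, even if an older notebook display shows them sorted.

```python
data = {
    "student": ["Tom", "Jerry", "Gloria", "Hillary"],
    "age": [21, 34, 45, 67],
    "gender": ["Male", "Female", "Female", "Male"],
}

ages = data["age"]                  # look up a value by its key
keys = list(data.keys())            # keys in insertion order
first_student = data["student"][0]  # index into the list stored under a key

print(keys)
print(ages)
print(first_student)
```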

### Installing and importing libraries (modules or packages)
- Install with pip or conda
- Import in the code as needed

#### Important libraries include:
- __NumPy__. Simple handling of vectors, matrices, and large multidimensional arrays in general.
- __Pandas__. Data management and analysis. Data structures and operations for working with numerical tables and time series.
- __Matplotlib__. Mathematical plots of all kinds.
- __Pyodbc__. Makes access to ODBC databases (e.g. SQL Server) easy.
- __Plotly__. Interactive charts of publication quality, also suitable for dashboards.
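
Before importing, it can help to check which of these libraries are installed in the current environment. The following is a small sketch using only the standard library; the list of import names is an assumption based on the usual names of these packages.

```python
import importlib.util

# Import names of the libraries listed above (assumed; an import name
# can differ from the name a package is installed under).
modules = ["numpy", "pandas", "matplotlib", "pyodbc", "plotly"]

# find_spec returns None when a top-level module cannot be found.
available = {name: importlib.util.find_spec(name) is not None for name in modules}

for name, ok in available.items():
    print(f"{name}: {'installed' if ok else 'missing'}")
```

Anything reported as missing can then be installed with pip or conda.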

In [3]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import pyodbc

In [None]:
# Simple scatter plot

%matplotlib inline

N = 50
x = np.random.rand(N)
y = np.random.rand(N)
colors = np.random.rand(N)
area = np.pi*(15*np.random.rand(N))**2

plt.scatter(x, y, s=area, c = colors, alpha=0.5)
plt.show()

In [None]:
# Numpy
# Array creation
a = np.array([2,3,4])

a = np.arange(15).reshape(3, 5)
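
To see what `reshape` produced, the array's basic attributes can be inspected. A short sketch, using the same array as in the cell above:

```python
import numpy as np

a = np.arange(15).reshape(3, 5)  # values 0..14 arranged in 3 rows and 5 columns

print(a)
print(a.shape)  # (3, 5)
print(a.ndim)   # 2
print(a.size)   # 15
```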

In [None]:
>>> a = np.array([20, 30, 40, 50])
>>> b = np.arange( 4 )
>>> b
array([0, 1, 2, 3])
>>> c = a-b
>>> c
array([20, 29, 38, 47])
>>> b**2
array([0, 1, 4, 9])
>>> 10*np.sin(a)
array([ 9.12945251, -9.88031624,  7.4511316 , -2.62374854])
>>> a<35
array([ True,  True, False, False])
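
A comparison such as `a<35` returns a boolean array, which can be used as a mask to select elements. A minimal sketch (the names `mask`, `small`, and `count` are illustrative):

```python
import numpy as np

a = np.array([20, 30, 40, 50])

mask = a < 35            # boolean array: True where the condition holds
small = a[mask]          # keeps only the elements where mask is True
count = int(mask.sum())  # True counts as 1, so the sum is the number of matches

print(small)
print(count)
```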


Dot product

>>> A = np.array( [[1,1],
...                [0,1]] )
>>> B = np.array( [[2,0],
...                [3,4]] )
>>> A * B                       # elementwise product (np.multiply)
array([[2, 0],
       [0, 4]])
>>> A @ B                       # matrix product (np.matmul)
array([[5, 4],
       [3, 4]])
>>> A.dot(B)                    # dot product (np.dot); same as the matrix product for 2-D arrays
array([[5, 4],
       [3, 4]])

For vectors: ![image.png](attachment:image.png)
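
For one-dimensional arrays, `@` and `np.dot` return a single number, the sum of the elementwise products. A small sketch with two made-up vectors `u` and `v`:

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

elementwise = u * v  # array([ 4, 10, 18])
scalar = u @ v       # 1*4 + 2*5 + 3*6 = 32

print(elementwise)
print(scalar)
print(np.dot(u, v))  # same value as u @ v
```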

In [None]:
>>> B = np.arange(3)
>>> B
array([0, 1, 2])
>>> np.exp(B)
array([1.        , 2.71828183, 7.3890561 ])
>>> np.sqrt(B)
array([0.        , 1.        , 1.41421356])
>>> C = np.array([2., -1., 4.])
>>> np.add(B, C)
array([2., 0., 6.])

### Installing Jupyter Notebook

- Install Python
- Install Jupyter (for example with `pip install notebook`)
- Start Jupyter (`jupyter notebook`)

### Pandas

Pandas provides several data structures, including Series and DataFrame.

- A Series is a one-dimensional labelled array that can hold data of any type (integer, string, float, Python objects, etc.). Its axis labels are collectively called the index.
- A DataFrame is a two-dimensional labelled data structure whose columns can hold different types.

In [5]:
data = np.array(['Tom','Jerry','Nick','Harry','Ruth','Gloria'])
names = pd.Series(data)
names

0       Tom
1     Jerry
2      Nick
3     Harry
4      Ruth
5    Gloria
dtype: object

In [6]:
# With index
names = pd.Series(data, index=[100,101,102,103,104,105])
print(names)

100       Tom
101     Jerry
102      Nick
103     Harry
104      Ruth
105    Gloria
dtype: object


In [13]:
# Accessing by index label
names[100]

'Tom'

In [15]:
names.iloc[2]

'Nick'
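
With a custom index, label-based and position-based access differ. The sketch below rebuilds the `names` Series from above and uses `.loc` for labels and `.iloc` for positions; the variable names `by_label` and `by_position` are illustrative.

```python
import numpy as np
import pandas as pd

data = np.array(['Tom', 'Jerry', 'Nick', 'Harry', 'Ruth', 'Gloria'])
names = pd.Series(data, index=[100, 101, 102, 103, 104, 105])

by_label = names.loc[102]    # element whose index label is 102
by_position = names.iloc[2]  # third element, regardless of its label

print(by_label, by_position)
print(102 in names)  # True: membership tests check the index labels
print(2 in names)    # False: 2 is a position, not a label
```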

In [9]:
# From dictionary

data = {
 'student' : ['Tom', 'Jerry', 'Gloria', 'Hillary'],
 'age' : [21, 34, 45, 67],
 'gender' : ['Male', 'Female', 'Female', 'Male']
 }

Student = pd.Series(data)
print(Student)

student    [Tom, Jerry, Gloria, Hillary]
age                     [21, 34, 45, 67]
gender      [Male, Female, Female, Male]
dtype: object
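
As the output shows, a Series built from a dictionary of lists stores each whole list as one element. For tabular data, a DataFrame is usually the better fit. A short sketch comparing the two (the names `as_series` and `as_frame` are illustrative):

```python
import pandas as pd

data = {
    'student': ['Tom', 'Jerry', 'Gloria', 'Hillary'],
    'age': [21, 34, 45, 67],
    'gender': ['Male', 'Female', 'Female', 'Male'],
}

as_series = pd.Series(data)    # three elements, each one a Python list
as_frame = pd.DataFrame(data)  # four rows, three columns

print(as_series.shape)         # (3,)
print(type(as_series['age']))  # <class 'list'>
print(as_frame.shape)          # (4, 3)
```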


In [18]:
# Dataframes
data = {
 'name': ['Kwadwo', 'Nana', 'Kwame', 'Naa'],
 'age': [20, 19, 22, 21],
 'favorite_color': ['red', 'orange', 'green', 'purple'],
 'grade': [67, 78, 90, 12]
 }
df = pd.DataFrame(data)
df

     name  age favorite_color  grade
0  Kwadwo   20            red     67
1    Nana   19         orange     78
2   Kwame   22          green     90
3     Naa   21         purple     12


In [19]:
df.columns

Index(['name', 'age', 'favorite_color', 'grade'], dtype='object')

In [20]:
df.values

array([['Kwadwo', 20, 'red', 67],
       ['Nana', 19, 'orange', 78],
       ['Kwame', 22, 'green', 90],
       ['Naa', 21, 'purple', 12]], dtype=object)

In [21]:
df.shape

(4, 4)

In [22]:
df.sort_values(by='age')

     name  age favorite_color  grade
1    Nana   19         orange     78
0  Kwadwo   20            red     67
3     Naa   21         purple     12
2   Kwame   22          green     90
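
`sort_values` returns a new DataFrame and leaves the original unchanged; `ascending=False` reverses the order. A minimal sketch with the same data, sorting by grade instead of age:

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Kwadwo', 'Nana', 'Kwame', 'Naa'],
    'age': [20, 19, 22, 21],
    'favorite_color': ['red', 'orange', 'green', 'purple'],
    'grade': [67, 78, 90, 12],
})

by_grade = df.sort_values(by='grade', ascending=False)

print(by_grade['name'].tolist())  # ['Kwame', 'Nana', 'Kwadwo', 'Naa']
print(df.index.tolist())          # [0, 1, 2, 3], the original order is kept
```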


In [23]:
# Selecting columns
df[['age', 'grade']]  # Display the age and grade columns only

   age  grade
0   20     67
1   19     78
2   22     90
3   21     12


In [35]:
df.loc[:2]  # Label-based slice: rows labelled 0 through 2 (end label included), here the first three rows

     name  age favorite_color  grade
0  Kwadwo   20            red     67
1    Nana   19         orange     78
2   Kwame   22          green     90


In [36]:
# Selection by Position
df.iloc[:2]

     name  age favorite_color  grade
0  Kwadwo   20            red     67
1    Nana   19         orange     78
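
The two cells above return different numbers of rows for the same `:2` because `.loc` slices by label and includes the end label, while `.iloc` slices by position and excludes the end. The difference becomes clearer once the index is no longer in order. A sketch with the same data (`sorted_df` is an illustrative name):

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Kwadwo', 'Nana', 'Kwame', 'Naa'],
    'age': [20, 19, 22, 21],
    'favorite_color': ['red', 'orange', 'green', 'purple'],
    'grade': [67, 78, 90, 12],
})

print(len(df.loc[:2]))   # 3 rows: labels 0, 1 and 2
print(len(df.iloc[:2]))  # 2 rows: positions 0 and 1

# After sorting by age, the index labels appear in the order 1, 0, 3, 2.
sorted_df = df.sort_values(by='age')

print(sorted_df.loc[0:2].index.tolist())   # [0, 3, 2]: from label 0 through label 2
print(sorted_df.iloc[0:2].index.tolist())  # [1, 0]: the first two rows by position
```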


In [41]:
df.grade>60

0     True
1     True
2     True
3    False
Name: grade, dtype: bool

In [42]:
df[df.grade>60]

     name  age favorite_color  grade
0  Kwadwo   20            red     67
1    Nana   19         orange     78
2   Kwame   22          green     90
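
Several conditions can be combined into one mask with `&` (and) or `|` (or); each condition needs its own parentheses. A minimal sketch with the same data, where the age threshold is chosen only for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Kwadwo', 'Nana', 'Kwame', 'Naa'],
    'age': [20, 19, 22, 21],
    'favorite_color': ['red', 'orange', 'green', 'purple'],
    'grade': [67, 78, 90, 12],
})

# Rows where the grade is above 60 and the age is at least 20.
selected = df[(df['grade'] > 60) & (df['age'] >= 20)]

print(selected['name'].tolist())  # ['Kwadwo', 'Kwame']
```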
