<div>
    <h1>Introduction to Python</h1>
</div>
<p>This notebook covers the basics of Python: variables, data types, loops, and functions.</p>
<p>
    Python has several built-in data types:
    <ul>
        <li><strong>Integers</strong>: Whole numbers (e.g., 1, 20, 100)</li>
        <li><strong>Floats</strong>: Decimal numbers (e.g., 1.5, 3.09)</li>
        <li><strong>Strings</strong>: Text data (e.g., "Hello, World!")</li>
        <li><strong>Lists</strong>: Ordered collections of values (e.g., ["Tehran", "Shiraz", "Yazd"])</li>
        <li><strong>Dictionaries</strong>: Key-value pairs (e.g., {"Iran": 90, "Turkey": 60, "Germany": 80})</li>
        <li><strong>Tuples</strong>: Immutable ordered collections (e.g., (3, 5, 6, 8))</li>
    </ul>
</p>


<p>In Python, we work with different <strong>data types</strong> (such as integers, strings, lists, and dictionaries).  
Each data type comes with its own <strong>functions</strong>, <strong>methods</strong>, and <strong>properties</strong>.  

- **Functions** are general tools we can use on many objects (e.g., `len()` gives the length of a list or a string).  
- **Methods** are actions that belong to a specific data type (e.g., `"hello".upper()` is a method for strings).  
- **Properties/Attributes** are characteristics of an object (for example, the shape of a NumPy array or the dtype of a Pandas Series).  

This means that every dataset or object in Python can have its own specific behaviors depending on its type.
</p>
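
As a minimal illustration of the three terms above, the sketch below uses a string and a complex number. The variable names are chosen for this example only and do not appear elsewhere in the notebook.

```python
# A function is called with the object passed in as an argument.
word = "hello"
length = len(word)       # len() also works on lists, tuples, dicts, ...

# A method is called on the object itself, using dot notation.
shouted = word.upper()   # upper() belongs to strings

# An attribute is read with dot notation and no parentheses.
z = 3 + 4j               # a complex number
real_part = z.real       # attribute holding the real component

print(length, shouted, real_part)   # 5 HELLO 3.0
```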

<div>
    <h2>1. Operations in Python</h2>
</div>

<div>
    <h3>1.1 Math operations</h3>
</div>

In [18]:
a = 10
b = 3

print(a + b)  # Addition → 13
print(a - b)  # Subtraction → 7
print(a * b)  # Multiplication → 30
print(a / b)  # Division → 3.3333...
print(a // b) # Floor division → 3
print(a % b)  # Modulus → 1
print(a ** b) # Exponentiation (10^3) → 1000


13
7
30
3.3333333333333335
3
1
1000
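
Floor division and modulus belong together: for integers, `a == (a // b) * b + (a % b)` always holds. The sketch below checks this with the built-in `divmod()` and shows what happens with a negative operand, where `//` rounds down (toward negative infinity) rather than toward zero. The variable names are illustrative.

```python
a, b = 10, 3

# divmod() returns the pair (a // b, a % b) in one call.
quotient, remainder = divmod(a, b)
assert a == quotient * b + remainder

# With a negative operand, // rounds down and % takes the sign of the divisor.
neg_quotient = -10 // 3
neg_remainder = -10 % 3

print(quotient, remainder)          # 3 1
print(neg_quotient, neg_remainder)  # -4 2
```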


<div>
    <h3>1.2 Relational operations</h3>
</div>

In [28]:
a = 13
b = 33
print(a<b)

True


In [30]:
print(a>b)

False


In [34]:
print(a == b)

False


In [36]:
print(a != b)

True


In [38]:
print(a<=b)

True


In [40]:
print(a>=b)

False


<div>
    <h3>1.3 Logical operations</h3>
</div>

In [45]:
a = True
b = False

In [47]:
print(a and b)

False


In [49]:
print(a or b)

True


In [51]:
print(not a)

False


In [53]:
True + True

2

In [55]:
True + False

1
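
Adding booleans works because `bool` is a subclass of `int`, with `True` equal to 1 and `False` equal to 0. One practical use is counting how many items satisfy a condition. The sketch below uses made-up sample data (`scores`) purely for illustration.

```python
# bool is a subclass of int, so True behaves like 1 and False like 0.
print(isinstance(True, int))   # True

# Summing booleans counts how many conditions are True.
scores = [45, 82, 67, 90, 38]                   # hypothetical sample data
passed = sum(score >= 50 for score in scores)   # counts scores of 50 or more
print(passed)                                   # 3
```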

<div>
    <h3>1.4 Assignment operations</h3>
</div>

In [85]:
a = 10
b = 5
b -= a
b

-5

In [89]:
c = 10
d = 5
d+=c
d

15

In [103]:
d *=c
d

150

<div>
    <h2>2. Python Data Types</h2>
</div>

<div>
    <h3>2.1 String</h3>
</div>

In [135]:
# String functions
x = 'python'
len(x)  # returns the length of the string

6

In [137]:
x = 100
x = str(x)  # converts an object to a string

In [139]:
type(x)

str

In [141]:
#String Methods

| Method | Description | Example | Output |
|--------|-------------|---------|--------|
| `upper()` | Converts the string to uppercase. | `"hello".upper()` | `"HELLO"` |
| `lower()` | Converts the string to lowercase. | `"HELLO".lower()` | `"hello"` |
| `title()` | Capitalizes the first letter of each word. | `"hello world".title()` | `"Hello World"` |
| `capitalize()` | Capitalizes the first letter of the string. | `"python".capitalize()` | `"Python"` |
| `strip()` | Removes leading and trailing whitespace. | `"  hello  ".strip()` | `"hello"` |
| `lstrip()` | Removes leading whitespace. | `"  hello  ".lstrip()` | `"hello  "` |
| `rstrip()` | Removes trailing whitespace. | `"  hello  ".rstrip()` | `"  hello"` |
| `replace(old, new)` | Replaces occurrences of `old` with `new`. | `"hello world".replace("world", "Python")` | `"hello Python"` |
| `split(separator)` | Splits the string into a list. | `"apple,banana,grape".split(",")` | `['apple', 'banana', 'grape']` |
| `join(iterable)` | Joins elements of an iterable into a string. | `"-".join(["a", "b", "c"])` | `"a-b-c"` |
| `find(substring)` | Returns the index of the first occurrence of `substring`. | `"hello".find("l")` | `2` |
| `index(substring)` | Same as `find()`, but raises an error if not found. | `"hello".index("l")` | `2` |
| `count(substring)` | Counts occurrences of `substring`. | `"banana".count("a")` | `3` |
| `startswith(prefix)` | Checks if the string starts with `prefix`. | `"hello".startswith("he")` | `True` |
| `endswith(suffix)` | Checks if the string ends with `suffix`. | `"hello".endswith("o")` | `True` |
| `isdigit()` | Checks if the string consists only of digits. | `"123".isdigit()` | `True` |
| `isalpha()` | Checks if the string consists only of letters. | `"hello".isalpha()` | `True` |
| `isalnum()` | Checks if the string consists only of letters and digits. | `"hello123".isalnum()` | `True` |
| `isspace()` | Checks if the string consists only of whitespace. | `"   ".isspace()` | `True` |
| `swapcase()` | Swaps uppercase and lowercase letters. | `"Hello".swapcase()` | `"hELLO"` |
| `zfill(width)` | Pads the string with zeros until it reaches `width` length. | `"42".zfill(5)` | `"00042"` |


In [144]:
string = 'Hello, World!'
string.upper()

'HELLO, WORLD!'

In [146]:
string.lower()

'hello, world!'

In [148]:
string.title()

'Hello, World!'

In [152]:
string.capitalize()

'Hello, world!'

In [166]:
string2 = ' Hello! world     '
string2.strip()

'Hello! world'

In [168]:
string2.lstrip()

'Hello! world     '

In [170]:
string.replace('Hello','hi')

'hi, World!'

In [172]:
string.split(',')

['Hello', ' World!']

In [194]:
string.index('H')

0
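
`find()` and `index()` return the same position when the substring exists, but they differ when it does not: `find()` returns `-1`, while `index()` raises a `ValueError`. A small sketch of that difference, using a separate variable `text` so the notebook's own variables are left untouched:

```python
text = "Hello, World!"

# find() signals "not found" with -1.
position = text.find("z")
print(position)   # -1

# index() signals "not found" by raising ValueError.
outcome = "no error"
try:
    text.index("z")
except ValueError:
    outcome = "ValueError"
print(outcome)    # ValueError
```

`find()` is convenient when a missing substring is an expected case, and `index()` when a missing substring should stop the program.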

In [196]:
string.count('l')

3

<div>
     <h3> 2.2 Lists </h3>
</div>

In [207]:
list1 = ['apple','banana','Orange', 'Strawberry', 'Mango']
len(list1)

5

In [216]:
list2 = [1,4,2,8,3,5,9, 6,7,0]
sorted(list2)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [218]:
max(list2)

9

| Method                  | Description                                                    | Example               |
|-------------------------|----------------------------------------------------------------|-----------------------|
| `append(x)`             | Adds item `x` to the end of the list.                         | `lst.append(5)`       |
| `extend(iterable)`      | Extends the list by appending elements from an iterable.       | `lst.extend([6, 7])`  |
| `insert(i, x)`          | Inserts item `x` at position `i`.                             | `lst.insert(1, 5)`    |
| `remove(x)`             | Removes the first occurrence of item `x`.                     | `lst.remove(5)`       |
| `pop([i])`              | Removes and returns the item at position `i` (the last item if `i` is omitted). | `lst.pop(2)`          |
| `clear()`               | Removes all items from the list.                              | `lst.clear()`         |
| `index(x[, start, end])`| Returns the index of the first occurrence of item `x`.        | `lst.index(3)`        |
| `count(x)`              | Returns the number of occurrences of item `x`.                | `lst.count(3)`        |
| `sort(key, reverse)`    | Sorts the list in place, ascending by default (`reverse=True` sorts descending). | `lst.sort(reverse=True)`|
| `reverse()`             | Reverses the elements of the list in place.                   | `lst.reverse()`       |
| `copy()`                | Returns a shallow copy of the list.                           | `lst_copy = lst.copy()`|
| `str.join(iterable)`    | A string method, not a list method: joins a list of strings into one string. | `", ".join(lst)`      |

In [228]:
# Creating a list
lst = [1, 2, 3, 4]
lst

[1, 2, 3, 4]

In [230]:
# append() - Adds an element to the end
lst.append(5)
lst

[1, 2, 3, 4, 5]

In [232]:
# extend() - Adds multiple elements
lst.extend([6, 7])  
lst

[1, 2, 3, 4, 5, 6, 7]

In [234]:
# insert() - Inserts an element at a specific index
lst.insert(2, 10) 
lst

[1, 2, 10, 3, 4, 5, 6, 7]

In [236]:
# remove() - Removes the first occurrence of an element
lst.remove(10)

In [238]:
# pop() - Removes and returns an element (default is last element)
removed_item = lst.pop(1) 
removed_item

2

In [240]:
lst.clear() 

In [242]:
lst

[]

In [244]:
lst = [1, 2, 3, 4]
index_of_3 = lst.index(3)
index_of_3

2

In [246]:
# count() - Counts occurrences of an element
lst.count(3)

1

In [248]:
# sort() - Sorts the list in ascending order
lst = [4, 1, 3, 2]
lst.sort() 

In [250]:
lst

[1, 2, 3, 4]
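
The method `sort()` is easy to confuse with the built-in function `sorted()` used earlier. `sorted()` returns a new list and leaves the original alone, whereas `sort()` reorders the list in place and returns `None`. A short sketch, with `values` as an illustrative name:

```python
values = [4, 1, 3, 2]

# sorted() builds a new list; values keeps its original order.
ordered = sorted(values)
print(values, ordered)   # [4, 1, 3, 2] [1, 2, 3, 4]

# sort() changes values itself and returns None.
result = values.sort()
print(values, result)    # [1, 2, 3, 4] None
```

Because `sort()` returns `None`, writing `values = values.sort()` would discard the list.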

In [252]:
# reverse() - Reverses the order of the list
lst.reverse()
lst

[4, 3, 2, 1]

In [254]:
# copy() - Creates a copy of the list
lst_copy = lst.copy() 
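
To see why `copy()` matters, compare it with plain assignment, which only creates a second name for the same list. The sketch below also shows what "shallow" means: nested lists inside the copy are still shared with the original. All names here are illustrative.

```python
original = [4, 3, 2, 1]

alias = original             # second name for the same list object
duplicate = original.copy()  # new list containing the same elements

original.append(0)
print(alias)       # [4, 3, 2, 1, 0]  (follows the change)
print(duplicate)   # [4, 3, 2, 1]     (unaffected)

# A shallow copy duplicates only the outer list; inner lists are shared.
nested = [[1, 2], [3, 4]]
nested_copy = nested.copy()
nested[0].append(99)
print(nested_copy[0])   # [1, 2, 99]
```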

In [256]:
words = ["Python", "is", "fun"]
sentence = " ".join(words)
sentence

'Python is fun'

In [258]:
words

['Python', 'is', 'fun']

In [260]:
# Selecting in lists
# Note: naming a variable `list` shadows Python's built-in list type for the rest of the session.
list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 19, 11, 12, 14, 19, 20]

In [262]:
list[1]

2

In [264]:
list[:5]

[1, 2, 3, 4, 5]

In [266]:
zz = list[-1]
type(zz)

int

In [268]:
#list[start:stop:step]
list[1:5:2]

[2, 4]
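
In `start:stop:step` slicing, any of the three parts can be omitted, and negative values count from the end or walk backwards. A few common patterns, sketched on an illustrative list called `values`:

```python
values = [1, 2, 3, 4, 5, 6, 7, 8, 9]

every_second = values[::2]     # whole list, taking every second item
last_three = values[-3:]       # negative start counts from the end
reversed_copy = values[::-1]   # negative step walks backwards

print(every_second)    # [1, 3, 5, 7, 9]
print(last_three)      # [7, 8, 9]
print(reversed_copy)   # [9, 8, 7, 6, 5, 4, 3, 2, 1]
```

The `stop` index is always excluded, which is why `values[1:5:2]` picks the items at positions 1 and 3 only.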

In [270]:
list2 =[[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
jj = list2[0]
type(jj)

list

In [272]:
list2[0][1]

2

In [274]:
min(list2)

[1, 2, 3]

In [276]:
sum(list)

140

In [278]:
len(list2)

3

In [280]:
2 in list

True

In [282]:
list3 = [x**2 for x in range(6)]
list3

[0, 1, 4, 9, 16, 25]
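
A list comprehension is a compact way of writing a loop that builds a list, and it can also filter items with an `if` clause. The sketch below shows a filtered comprehension next to the equivalent loop; the names `even_squares` and `loop_result` are illustrative.

```python
# Comprehension with a condition: squares of the even numbers below 6.
even_squares = [x**2 for x in range(6) if x % 2 == 0]

# The same result written as an explicit loop.
loop_result = []
for x in range(6):
    if x % 2 == 0:
        loop_result.append(x**2)

print(even_squares)   # [0, 4, 16]
print(loop_result)    # [0, 4, 16]
```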

In [284]:
numbers = [10, 20, 30, 40, 50]

In [286]:
max(numbers)

50

In [288]:
sum(numbers)

150

In [307]:
list4 = [1,2,3,4,5,6,7]
list4[1]

2

In [309]:
list4[1] = 100
list4

[1, 100, 3, 4, 5, 6, 7]

In [311]:
list4[0:2] = [101,102]
list4

[101, 102, 3, 4, 5, 6, 7]

In [315]:
list5 = [8,9,10]
list4 + list5

[101, 102, 3, 4, 5, 6, 7, 8, 9, 10]

In [317]:
del list4[1]

In [319]:
list4

[101, 3, 4, 5, 6, 7]

In [321]:
list4

[101, 3, 4, 5, 6, 7]

In [323]:
list4[0] = 2
list4

[2, 3, 4, 5, 6, 7]

<div>
    <h4>Arithmetic operators do not act element by element on lists: for example, <code>numbers * 2</code> repeats the list instead of doubling each value. To handle numerical computations efficiently, we can use arrays instead. We'll work on that later.
    </h4>
</div>

In [82]:
numbers *2

[10, 20, 30, 40, 50, 10, 20, 30, 40, 50]
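
Until arrays are introduced, one way to apply an operation to every element of a plain list is a list comprehension, optionally combined with `zip()` to pair up two lists. This is a minimal sketch; the names `doubled` and `totals` are introduced here for illustration.

```python
numbers = [10, 20, 30, 40, 50]

# Multiply each element by 2 (instead of repeating the list).
doubled = [n * 2 for n in numbers]

# Add two lists element by element by pairing their items with zip().
totals = [a + b for a, b in zip(numbers, doubled)]

print(doubled)   # [20, 40, 60, 80, 100]
print(totals)    # [30, 60, 90, 120, 150]
```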

<div>
    <h3>
        2.3 Dictionary
    </h3>
</div>

In [349]:
data = {"name": "Alice", "age": 25, 'weight': 53, 'height': 167}

In [351]:
data.keys()        # Returns a view of all keys

dict_keys(['name', 'age', 'weight', 'height'])

In [353]:
data.values()      # Returns a view of all values

dict_values(['Alice', 25, 53, 167])

In [355]:
data.items()       # Returns a view of all (key, value) pairs

dict_items([('name', 'Alice'), ('age', 25), ('weight', 53), ('height', 167)])

In [357]:
data.get("name")   # Get value of key → "Alice"

'Alice'
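
The main reason to use `get()` instead of square brackets is how a missing key is handled: `get()` returns `None` (or a default you supply), while `dict[key]` raises a `KeyError`. A short sketch using a separate, hypothetical dictionary `person`:

```python
person = {"name": "Alice", "age": 25}   # hypothetical example data

# get() returns None when the key is missing ...
email = person.get("email")
# ... or the default passed as the second argument.
email_or_default = person.get("email", "unknown")

# Square brackets raise KeyError for a missing key.
missing = False
try:
    person["email"]
except KeyError:
    missing = True

print(email, email_or_default, missing)   # None unknown True
```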

In [359]:
data.update({"city": "Berlin"})  # Adds new key-value pair

In [361]:
data

{'name': 'Alice', 'age': 25, 'weight': 53, 'height': 167, 'city': 'Berlin'}

In [363]:
data.pop("age")    # Removes key "age" and returns its value

25

In [367]:
data

{'name': 'Alice', 'weight': 53, 'height': 167, 'city': 'Berlin'}

<div>
    <h2>
        3. Import libraries
    </h2>
</div>

<div>
    <h2> 3.1 NumPy </h2>
</div>

<div>
    <h4>
        The NumPy (numpy) library is used in Python for efficient numerical computing. It provides powerful tools for working with arrays, matrices, and mathematical operations.
    </h4>
</div>

In [374]:
import numpy as np

In [376]:
np.array([1,2,3,4,5,5,6])

array([1, 2, 3, 4, 5, 5, 6])

In [378]:
weight = np.array([67, 72, 89, 90])

In [380]:
weight *2

array([134, 144, 178, 180])

In [382]:
y = np.array ([[3,5,6,7],
               [6,7,8,9],
               [7,8,9,1]])
y

array([[3, 5, 6, 7],
       [6, 7, 8, 9],
       [7, 8, 9, 1]])

In [384]:
y[1]

array([6, 7, 8, 9])

In [386]:
y[2][0]

7

In [390]:
y[0:2,2:]
# y[rows, columns]

array([[6, 7],
       [8, 9]])

In [392]:
y > 6

array([[False, False, False,  True],
       [False,  True,  True,  True],
       [ True,  True,  True, False]])

In [394]:
np.random.seed(123)
np.random.rand()

0.6964691855978616

In [396]:
type(y)

numpy.ndarray

In [400]:
arr = np.array([[1, 2, 3],[4, 5, 6]])
arr

array([[1, 2, 3],
       [4, 5, 6]])

In [402]:
# Basic Properties of NumPy Arrays
arr.shape

(2, 3)

In [404]:
arr.ndim

2

In [406]:
arr.size

6

In [408]:
arr.dtype

dtype('int64')

In [410]:
arr.itemsize

8

In [421]:
arr.T

array([[1, 4],
       [2, 5],
       [3, 6]])

In [412]:
arr.nbytes

48

<div>
    <h2> NumPy Array Methods
    </h2>
</div>

In [453]:
#Array Manipulation Methods
arr2 = np.array([[1, 2, 3, 4],[5, 6, 7, 8]])
arr2

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [455]:
arr2.reshape(4,2)

array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])

In [457]:
arr2.flatten()

array([1, 2, 3, 4, 5, 6, 7, 8])

In [459]:
arr2.copy()

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [461]:
# Mathematical & Statistical Methods
arr3 = np.array([1, 3, 6, 9, 12, 15])
arr3

array([ 1,  3,  6,  9, 12, 15])

In [463]:
arr3.sum()

46

In [465]:
arr3.min()

1

In [467]:
arr3.max()

15

In [469]:
arr3.mean()

7.666666666666667

In [471]:
arr3.std()

4.887626099538393

In [473]:
arr3.var()

23.888888888888886

In [475]:
arr3.prod()

29160

In [477]:
arr3.cumsum()

array([ 1,  4, 10, 19, 31, 46])

In [479]:
arr.cumprod()  # arr is the 2x3 array defined earlier; cumprod() flattens it before multiplying

array([  1,   2,   6,  24, 120, 720])

In [481]:
# Sorting & Searching Methods
arr4 = np.array([4,1,5,7,9,0,12,3,40])
arr4

array([ 4,  1,  5,  7,  9,  0, 12,  3, 40])

In [483]:
np.sort(arr4)   # returns a sorted copy; arr4 itself is not changed

array([ 0,  1,  3,  4,  5,  7,  9, 12, 40])
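
NumPy offers two ways to sort: the function `np.sort()` returns a sorted copy and leaves its input alone, while the array method `.sort()` reorders the array in place and returns `None`. The sketch below contrasts the two on a fresh array (`values`, an illustrative name) so that `arr4` is not affected.

```python
import numpy as np

values = np.array([4, 1, 5, 7, 9, 0, 12, 3, 40])

# np.sort() returns a new sorted array; values keeps its order.
sorted_copy = np.sort(values)
before = values.copy()        # snapshot to show nothing changed
print(before)

# The .sort() method sorts values itself and returns None.
result = values.sort()
print(values, result)
```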

In [560]:
arr5 = np.array([[7,8,5],
                 [4,9,3]])
arr5

array([[7, 8, 5],
       [4, 9, 3]])

In [562]:
arr5.sort()  # sorts in place along the last axis by default, so each row is sorted
arr5

array([[5, 7, 8],
       [3, 4, 9]])

In [558]:
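arr5 = np.array([[7,8,5],
                 [4,9,3]])  # start again from the unsorted array
arr5.sort(axis=0)           # axis=0 sorts each column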
arr5

array([[4, 8, 3],
       [7, 9, 5]])

In [485]:
np.argsort(arr4)

array([5, 1, 7, 0, 2, 3, 4, 6, 8])

In [487]:
np.where(arr4 > 8)

(array([4, 6, 8]),)

<div>
    <h2> Basic NumPy Functions
    </h2>
</div>

In [494]:
arr1 =np.array([1,2,3])
arr1

array([1, 2, 3])

In [496]:
np.zeros((2,3)) #create an array of zeros  

array([[0., 0., 0.],
       [0., 0., 0.]])

In [498]:
np.ones((3,2))

array([[1., 1.],
       [1., 1.],
       [1., 1.]])

In [500]:
np.sum(arr1)

6

In [502]:
np.mean(arr1)

2.0

In [504]:
np.add(arr4,100)   # returns a new array; arr4 itself is not changed

array([104, 101, 105, 107, 109, 100, 112, 103, 140])

In [506]:
np.subtract(arr4,1)

array([ 3,  0,  4,  6,  8, -1, 11,  2, 39])

In [508]:
np.multiply(arr4, 2)

array([ 8,  2, 10, 14, 18,  0, 24,  6, 80])

In [510]:
np.divide(arr4, 2)

array([ 2. ,  0.5,  2.5,  3.5,  4.5,  0. ,  6. ,  1.5, 20. ])

In [512]:
np.power(arr4, 2)

array([  16,    1,   25,   49,   81,    0,  144,    9, 1600])

In [514]:
print(np.random.randint(1, 10, 5)) # 5 random integers from 1 to 9
print(np.random.rand(3))           # 3 random floats (0 to 1)
print(np.random.normal(0, 1, 5))   # 5 random numbers from normal distribution

[3 3 7 2 4]
[0.49111893 0.78002776 0.41092437]
[-1.07746533  0.23848917  1.67960037 -1.30580313 -1.13889525]


In [516]:
A = np.array([[1,2],[3,4]])
B = np.array([[5,6],[7,8]])
np.dot(A,B)

array([[19, 22],
       [43, 50]])

In [536]:
np.multiply(A,B)

array([[ 5, 12],
       [21, 32]])
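
The two results above differ because `np.dot()` computes the matrix product (rows of `A` times columns of `B`), while `np.multiply()` multiplies matching elements. For 2-D arrays the operators `@` and `*` do the same two jobs. The sketch below repeats the calculation with the operators and works out one entry of the matrix product by hand; `top_left` is an illustrative name.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

matrix_product = A @ B   # same as np.dot(A, B) for 2-D arrays
elementwise = A * B      # same as np.multiply(A, B)

# Top-left entry of the matrix product: row 0 of A times column 0 of B.
top_left = A[0, 0] * B[0, 0] + A[0, 1] * B[1, 0]   # 1*5 + 2*7

print(matrix_product)
print(elementwise)
print(top_left)          # 19
```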

In [518]:
x = np.array([1, 2, 3])
z = np.array([4, 5, 6])

In [520]:
np.vstack([x,z])

array([[1, 2, 3],
       [4, 5, 6]])

In [522]:
np.hstack([x,z])

array([1, 2, 3, 4, 5, 6])

In [524]:
arr5 = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [5, 9, 0, 3]])

In [526]:
arr5[1]

array([5, 6, 7, 8])

In [528]:
arr5[0][1]

2

In [530]:
arr5[:2]

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [532]:
arr5[:,2:]

array([[3, 4],
       [7, 8],
       [0, 3]])

In [534]:
arr5 > 4

array([[False, False, False, False],
       [ True,  True,  True,  True],
       [ True,  True, False, False]])

In [213]:
arr5[arr5 > 4]

array([5, 6, 7, 8, 5, 9])

<div>
    <h2> 3.2 Pandas</h2>
</div>

<div>
    <h4>
        Pandas is a Python library used for data manipulation, analysis, and cleaning. It provides fast and flexible tools to work with structured data, such as tables (DataFrames) and one-dimensional labeled data (Series). </h4>
</div>

<p>
    Key Reasons to Use Pandas:
    <ul>
        <li> Handles large datasets efficiently</li>
        <li> Supports data cleaning and preprocessing</li>
        <li> Works well with NumPy, Matplotlib, and SQL</li>
        <li> Provides built-in statistical and mathematical functions</li>
        <li> Can read and write multiple file formats (CSV, Excel, SQL, JSON, etc.)</li>
    </ul>
</p>


In [566]:
import pandas as pd

<div>
    <h2>
        DataFrame
    </h2>
</div>

In [569]:
# Create a DataFrame from a Dictionary
data = {
    "name" : ["alice", "anna", "jack", "july"],
    "lastname" : ["kepit", "kia", "muler" , "summet"],
    "age" : [24, 28, 25, 31],
    "job" : ["student", "engineer", "doctor", "teacher"]
}
print(data)

{'name': ['alice', 'anna', 'jack', 'july'], 'lastname': ['kepit', 'kia', 'muler', 'summet'], 'age': [24, 28, 25, 31], 'job': ['student', 'engineer', 'doctor', 'teacher']}


In [666]:
df = pd.DataFrame(data)
df

Unnamed: 0,name,lastname,age,job
0,alice,kepit,24,student
1,anna,kia,28,engineer
2,jack,muler,25,doctor
3,july,summet,31,teacher


In [573]:
# DataFrame Properties (Attributes)
df.shape

(4, 4)

In [575]:
df.size

16

In [577]:
df.dtypes

name        object
lastname    object
age          int64
job         object
dtype: object

In [230]:
df.columns

Index(['name', 'lastname', 'age', 'job'], dtype='object')

In [232]:
df.index

RangeIndex(start=0, stop=4, step=1)

In [581]:
df.head()

Unnamed: 0,name,lastname,age,job
0,alice,kepit,24,student
1,anna,kia,28,engineer
2,jack,muler,25,doctor
3,july,summet,31,teacher


In [583]:
# Access a single column (double brackets return a one-column DataFrame; df["name"] returns a Series)
df[["name"]]

Unnamed: 0,name
0,alice
1,anna
2,jack
3,july


In [585]:
df[["name", "job"]]

Unnamed: 0,name,job
0,alice,student
1,anna,engineer
2,jack,doctor
3,july,teacher


In [668]:
# loc selects rows by index label; iloc selects rows by integer position
df.loc[[1]]

Unnamed: 0,name,lastname,age,job
1,anna,kia,28,engineer


In [672]:
df.iloc[[1]]

Unnamed: 0,name,lastname,age,job
1,anna,kia,28,engineer
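
With the default index (0, 1, 2, ...), labels and positions coincide, so `loc` and `iloc` return the same row. The difference shows up once the index has other labels. The sketch below builds a small, hypothetical DataFrame called `people` with labels 101 to 103 to make that visible.

```python
import pandas as pd

people = pd.DataFrame(
    {"name": ["alice", "anna", "jack"], "age": [24, 28, 25]},
    index=[101, 102, 103],   # custom row labels
)

by_label = people.loc[102, "name"]   # row labelled 102
by_position = people.iloc[1, 0]      # second row, first column

# No row carries the label 1, so loc raises KeyError here.
label_missing = False
try:
    people.loc[1]
except KeyError:
    label_missing = True

# Slices also differ: loc includes the end label, iloc excludes the end position.
loc_rows = len(people.loc[101:102])   # 2 rows
iloc_rows = len(people.iloc[0:1])     # 1 row

print(by_label, by_position, label_missing, loc_rows, iloc_rows)
```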


In [599]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 4 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   name      4 non-null      object
 1   lastname  4 non-null      object
 2   age       4 non-null      int64 
 3   job       4 non-null      object
dtypes: int64(1), object(3)
memory usage: 260.0+ bytes


In [601]:
df.describe()

Unnamed: 0,age
count,4.0
mean,27.0
std,3.162278
min,24.0
25%,24.75
50%,26.5
75%,28.75
max,31.0


In [603]:
df.sort_values(by = "age")

Unnamed: 0,name,lastname,age,job
0,alice,kepit,24,student
2,jack,muler,25,doctor
1,anna,kia,28,engineer
3,july,summet,31,teacher


In [605]:
df[df["age"] > 26]

Unnamed: 0,name,lastname,age,job
1,anna,kia,28,engineer
3,july,summet,31,teacher


In [607]:
# Adding a New Column
df["sex"] = ["F", "F", "M", "F" ]
df

Unnamed: 0,name,lastname,age,job,sex
0,alice,kepit,24,student,F
1,anna,kia,28,engineer,F
2,jack,muler,25,doctor,M
3,july,summet,31,teacher,F


In [609]:
# Dropping a column (returns a new DataFrame; df itself is not changed)
df.drop(columns = ['age'])

Unnamed: 0,name,lastname,job,sex
0,alice,kepit,student,F
1,anna,kia,engineer,F
2,jack,muler,doctor,M
3,july,summet,teacher,F


In [611]:
# Dropping a row (returns a new DataFrame; df itself is not changed)
df.drop(index = 1)

Unnamed: 0,name,lastname,age,job,sex
0,alice,kepit,24,student,F
2,jack,muler,25,doctor,M
3,july,summet,31,teacher,F


In [619]:
df['index'] = ['num1','num2','num3','num4']
df

Unnamed: 0,name,lastname,age,job,sex,index
0,alice,kepit,24,student,F,num1
1,anna,kia,28,engineer,F,num2
2,jack,muler,25,doctor,M,num3
3,july,summet,31,teacher,F,num4


In [621]:
df.set_index('index', inplace=True)

In [623]:
df

index,name,lastname,age,job,sex
num1,alice,kepit,24,student,F
num2,anna,kia,28,engineer,F
num3,jack,muler,25,doctor,M
num4,july,summet,31,teacher,F


In [625]:
df.reset_index(inplace = True)

In [627]:
df

Unnamed: 0,index,name,lastname,age,job,sex
0,num1,alice,kepit,24,student,F
1,num2,anna,kia,28,engineer,F
2,num3,jack,muler,25,doctor,M
3,num4,july,summet,31,teacher,F


In [635]:
df.rename(columns = {'index':'num'}, inplace = True)

In [637]:
df

Unnamed: 0,num,name,lastname,age,job,sex
0,num1,alice,kepit,24,student,F
1,num2,anna,kia,28,engineer,F
2,num3,jack,muler,25,doctor,M
3,num4,july,summet,31,teacher,F


In [645]:
df.index = [101,102,103,104]

In [647]:
df

Unnamed: 0,num,name,lastname,age,job,sex
101,num1,alice,kepit,24,student,F
102,num2,anna,kia,28,engineer,F
103,num3,jack,muler,25,doctor,M
104,num4,july,summet,31,teacher,F


<div>
    <h3> Some exercises
    </h3>
</div>

In [650]:
df = pd.DataFrame(np.arange(12).reshape(3,4), columns = ["G1", "G2", "G3", "G4"])
df

Unnamed: 0,G1,G2,G3,G4
0,0,1,2,3
1,4,5,6,7
2,8,9,10,11


In [652]:
df.drop([0]) # from rows

Unnamed: 0,G1,G2,G3,G4
1,4,5,6,7
2,8,9,10,11


In [654]:
df.drop(["G2"], axis =1) # from columns

Unnamed: 0,G1,G3,G4
0,0,2,3
1,4,6,7
2,8,10,11


In [656]:
df.insert(4, "G5", [4,6,7])
df

Unnamed: 0,G1,G2,G3,G4,G5
0,0,1,2,3,4
1,4,5,6,7,6
2,8,9,10,11,7


In [658]:
df["G1"]

0    0
1    4
2    8
Name: G1, dtype: int64

In [660]:
df[["G1", "G5"]]

Unnamed: 0,G1,G5
0,0,4
1,4,6
2,8,7


In [271]:
df.loc[1,]

G1    4
G2    5
G3    6
G4    7
G5    6
Name: 1, dtype: int64

In [662]:
df.loc[1,"G5"]

6

In [274]:
df.loc[[1,]]

Unnamed: 0,G1,G2,G3,G4,G5
1,4,5,6,7,6


In [277]:
df.loc[[0, 1, 2], ["G1", "G2"]]

Unnamed: 0,G1,G2
0,0,1
1,4,5
2,8,9


In [297]:
df.iloc[[0,1], [0,1]]

Unnamed: 0,G1,G2
0,0,1
1,4,5


In [688]:
df2 = pd.DataFrame(np.arange(16).reshape(4,4), columns= ["C1" , "C2", "C3", "C4" ])
df2

Unnamed: 0,C1,C2,C3,C4
0,0,1,2,3
1,4,5,6,7
2,8,9,10,11
3,12,13,14,15


In [690]:
df2.sort_values(["C2"]) 

Unnamed: 0,C1,C2,C3,C4
0,0,1,2,3
1,4,5,6,7
2,8,9,10,11
3,12,13,14,15


In [692]:
df2[["C2"]] ==1 

Unnamed: 0,C2
0,True
1,False
2,False
3,False


In [694]:
df2[df2["C2"]==1]

Unnamed: 0,C1,C2,C3,C4
0,0,1,2,3


In [696]:
df2[(df2["C2"]==1) | (df2["C2"]==5)]

Unnamed: 0,C1,C2,C3,C4
0,0,1,2,3
1,4,5,6,7


In [698]:
df2[(df2["C2"]==1) & (df2["C1"]==0)]

Unnamed: 0,C1,C2,C3,C4
0,0,1,2,3
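
The parentheses around each condition are required, not optional: `&` and `|` bind more tightly than comparison operators such as `==` and `>`, so without parentheses Python groups the expression differently and pandas raises a `ValueError`. A minimal sketch that rebuilds the same 4x4 frame under the name `frame` (chosen here so `df2` is left untouched):

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(16).reshape(4, 4),
                     columns=["C1", "C2", "C3", "C4"])

# Each comparison is wrapped in parentheses before combining with &.
mask = (frame["C2"] > 1) & (frame["C4"] < 15)
selected = frame[mask]
print(selected.index.tolist())   # [1, 2]

# Without parentheses this is read as C2 > (1 & C4) < 15, which fails.
failed = False
try:
    frame[frame["C2"] > 1 & frame["C4"] < 15]
except ValueError:
    failed = True
print(failed)                    # True
```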


In [700]:
x = np.logical_or(df2["C1"] == 4, df2["C2"]==0)

In [702]:
df2[x]

Unnamed: 0,C1,C2,C3,C4
1,4,5,6,7


In [704]:
df2[np.logical_and(df2["C2"]==13,df2["C4"]==15)]

Unnamed: 0,C1,C2,C3,C4
3,12,13,14,15


In [708]:
#.isin
df2[df2["C2"].isin([1,5,13])]


Unnamed: 0,C1,C2,C3,C4
0,0,1,2,3
1,4,5,6,7
3,12,13,14,15


In [710]:
data = np.random.rand(10,5)
data

array([[0.69475518, 0.5939024 , 0.63179202, 0.44025718, 0.08372648],
       [0.71233018, 0.42786349, 0.2977805 , 0.49208478, 0.74029639],
       [0.35772892, 0.41720995, 0.65472131, 0.37380143, 0.23451288],
       [0.98799529, 0.76599595, 0.77700444, 0.02798196, 0.17390652],
       [0.15408224, 0.07708648, 0.8898657 , 0.7503787 , 0.69340324],
       [0.51176338, 0.46426806, 0.56843069, 0.30254945, 0.49730879],
       [0.68326291, 0.91669867, 0.10892895, 0.49549179, 0.23283593],
       [0.43686066, 0.75154299, 0.48089213, 0.79772841, 0.28270293],
       [0.43341824, 0.00975735, 0.34079598, 0.68927201, 0.86936929],
       [0.26780382, 0.45674792, 0.26828131, 0.8370528 , 0.27051466]])

In [712]:
df = pd.DataFrame(data, [2020, 2021, 2022, 2023, 2024, 2025, 2026, 2027, 2028, 2029], ["Canada", "USA", "China", "Italy", "France"])
df

Unnamed: 0,Canada,USA,China,Italy,France
2020,0.694755,0.593902,0.631792,0.440257,0.083726
2021,0.71233,0.427863,0.29778,0.492085,0.740296
2022,0.357729,0.41721,0.654721,0.373801,0.234513
2023,0.987995,0.765996,0.777004,0.027982,0.173907
2024,0.154082,0.077086,0.889866,0.750379,0.693403
2025,0.511763,0.464268,0.568431,0.302549,0.497309
2026,0.683263,0.916699,0.108929,0.495492,0.232836
2027,0.436861,0.751543,0.480892,0.797728,0.282703
2028,0.433418,0.009757,0.340796,0.689272,0.869369
2029,0.267804,0.456748,0.268281,0.837053,0.270515


In [714]:
df["Hungery"] = np.random.rand(10,1) #how to add a column

In [716]:
df

Unnamed: 0,Canada,USA,China,Italy,France,Hungery
2020,0.694755,0.593902,0.631792,0.440257,0.083726,0.530062
2021,0.71233,0.427863,0.29778,0.492085,0.740296,0.175373
2022,0.357729,0.41721,0.654721,0.373801,0.234513,0.314966
2023,0.987995,0.765996,0.777004,0.027982,0.173907,0.891109
2024,0.154082,0.077086,0.889866,0.750379,0.693403,0.180336
2025,0.511763,0.464268,0.568431,0.302549,0.497309,0.494316
2026,0.683263,0.916699,0.108929,0.495492,0.232836,0.212298
2027,0.436861,0.751543,0.480892,0.797728,0.282703,0.520877
2028,0.433418,0.009757,0.340796,0.689272,0.869369,0.1601
2029,0.267804,0.456748,0.268281,0.837053,0.270515,0.919057


In [718]:
df["Canada"].mean()

0.5240000818247706

In [720]:
df.mean()

Canada     0.524000
USA        0.488107
China      0.501849
Italy      0.520660
France     0.407858
Hungery    0.439849
dtype: float64

In [722]:
df.mean(axis=1)

2020    0.495749
2021    0.474288
2022    0.392157
2023    0.603999
2024    0.457525
2025    0.473106
2026    0.441586
2027    0.545101
2028    0.417119
2029    0.503243
dtype: float64

In [728]:
# How to build a DataFrame from a dictionary
pd2 = pd.DataFrame({
    "name": ["A", "M", "L", "J", "I", "V", "D", "F"],
    "color": ["Red", "Black", "Green","Red", "Black","Black", "Green","Red"],
    "age": [22, 34, 20, 22, 28, 20, 30, 23],
    "country": ["China", "France", "USA","China", "France", "USA", "USA", "France"]
})

In [730]:
pd2

Unnamed: 0,name,color,age,country
0,A,Red,22,China
1,M,Black,34,France
2,L,Green,20,USA
3,J,Red,22,China
4,I,Black,28,France
5,V,Black,20,USA
6,D,Green,30,USA
7,F,Red,23,France
