# pandas portfolio part zero
pandas is a powerful and widely-used Python library for data manipulation and analysis. It provides two essential data structures: `Series` and `DataFrame`.

### Series
A Series is a one-dimensional array-like object that can hold data of any type (integers, strings, floats, etc.). Each element in a Series has a label called its index; labels are usually unique, but they do not have to be. You can think of a Series as a single column in a spreadsheet or a database table. Series support operations such as indexing, slicing, and vectorized arithmetic.

### DataFrame
A DataFrame is a two-dimensional, table-like data structure that consists of multiple Series. It is similar to a spreadsheet or a SQL table, with rows and columns. Each column in a DataFrame is a Series, and the DataFrame itself is a collection of these Series. DataFrames are highly versatile and can handle a wide variety of data manipulation tasks, such as filtering, grouping, and merging.
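To make the relationship between the two structures concrete, here is a minimal sketch that builds a DataFrame out of two Series sharing the same index. The names `math_scores`, `physics_scores`, `grades` and all the values are invented for illustration.

```python
import pandas as pd

# Two hypothetical Series that share the same index labels.
math_scores = pd.Series([20, 19, 18], index=['sam', 'pam', 'panam'])
physics_scores = pd.Series([15, 17, 12], index=['sam', 'pam', 'panam'])

# Each Series becomes one column of the DataFrame.
grades = pd.DataFrame({'math': math_scores, 'physics': physics_scores})

# Selecting a single column gives back a Series.
print(type(grades['math']))
print(grades.shape)   # (3, 2)
```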

Consider an example of students' scores like this:

In [3]:
import pandas as pd

In [4]:
scores = pd.Series([20, 19, 18, 6, 9])
scores

0    20
1    19
2    18
3     6
4     9
dtype: int64

Above, we got a simple Series, which can be thought of as a single column.
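Since a Series supports vectorized operations, arithmetic and comparisons apply to every element at once. The sketch below assumes the scores are out of 20 and that 10 is the pass mark; both are assumptions made only for this example.

```python
import pandas as pd

scores = pd.Series([20, 19, 18, 6, 9])

# Arithmetic is applied to every element; no explicit loop is needed.
# Assumption: scores are out of 20, so multiplying by 5 gives a percentage.
percent = scores * 5

# Comparisons are vectorized too and produce a boolean Series.
# Assumption: 10 is the pass mark.
passed = scores >= 10

print(percent.tolist())   # [100, 95, 90, 30, 45]
print(passed.tolist())    # [True, True, True, False, False]
```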

---

Now let's give its rows **names** instead of the default integer positions:

In [7]:
scores = pd.Series([20, 19, 18, 6, 9], index=['sam', 'pam', 'panam', 'jym', 'kim'])
scores

sam      20
pam      19
panam    18
jym       6
kim       9
dtype: int64

---

The whole Series can also be given a **name**:

In [30]:
scores = pd.Series([20, 19, 18, 6, 9], index=['sam', 'pam', 'panam', 'jym', 'kim'], name='score table')
scores

sam      20
pam      19
panam    18
jym       6
kim       9
Name: score table, dtype: int64

---
Here is how to **access the values** by index:

In [31]:
scores['sam']
# scores.iloc[0]
# Both lines return the same value: the first looks it up by label, the second by position.

20
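Lookups by label and by position can be written explicitly with `.loc` and `.iloc`, which avoids any ambiguity about what a key means. A minimal sketch using the same data:

```python
import pandas as pd

scores = pd.Series([20, 19, 18, 6, 9],
                   index=['sam', 'pam', 'panam', 'jym', 'kim'])

by_label = scores.loc['sam']            # label-based lookup
by_position = scores.iloc[0]            # position-based lookup
several = scores.loc[['sam', 'kim']]    # a list of labels returns a Series

print(by_label, by_position)   # 20 20
print(several.tolist())        # [20, 9]
```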

---
Values can also be selected by a **range of indexes** (a slice), like below:

In [32]:
scores[0:3]

sam      20
pam      19
panam    18
Name: score table, dtype: int64
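One detail worth knowing about slices: a positional slice excludes its end, while a label slice with `.loc` includes it. The sketch below shows both forms returning the same three rows.

```python
import pandas as pd

scores = pd.Series([20, 19, 18, 6, 9],
                   index=['sam', 'pam', 'panam', 'jym', 'kim'])

# Positional slice: positions 0, 1, 2 (the end position 3 is excluded).
positional = scores.iloc[0:3]

# Label slice: 'sam' through 'panam' (the end label is included).
labelled = scores.loc['sam':'panam']

print(list(positional.index))       # ['sam', 'pam', 'panam']
print(positional.equals(labelled))  # True
```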

---
The **index column can be given a name** as well, like this:

In [33]:
scores.index.name = 'students'
scores

students
sam      20
pam      19
panam    18
jym       6
kim       9
Name: score table, dtype: int64

---
The **drop** method: it returns a new Series without the row whose index label you pass in.

In [34]:
scores = scores.drop('pam')
scores

students
sam      20
panam    18
jym       6
kim       9
Name: score table, dtype: int64

---
Here is how to drop more than one row, by passing a list of labels in square brackets `[]`:

In [35]:
scores = scores.drop(['sam', 'jym'])
scores

students
panam    18
kim       9
Name: score table, dtype: int64

---
Here is what the **pop** method does: it removes the row in place and returns its value.

In [36]:
scores.pop('panam')

18

In [37]:
scores

students
kim    9
Name: score table, dtype: int64

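Quick question: what is the difference between pop and drop (and del)? Short answer: `pop` and `drop` are both pandas methods. `drop` returns a new Series and leaves the original unchanged, `pop` removes the row from the original and returns its value, and `del scores['label']` removes the row from the original without returning anything.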
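To see how the three behave in practice, here is a small sketch on a fresh Series. The variable names `without_pam` and `popped` are made up for this example.

```python
import pandas as pd

scores = pd.Series([20, 19, 18, 6, 9],
                   index=['sam', 'pam', 'panam', 'jym', 'kim'])

# drop returns a new Series; the original still has all five rows.
without_pam = scores.drop('pam')

# pop removes the row from the original and returns its value.
popped = scores.pop('sam')

# del removes the row from the original and returns nothing.
del scores['jym']

print(len(without_pam))     # 4
print(popped)               # 20
print(list(scores.index))   # ['pam', 'panam', 'kim']
```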

---
The **isin() method**: checks, element by element, whether each value of the Series (or DataFrame) is among the specified value(s).


In [38]:
scores = pd.Series([20, 19, 18, 6, 9], index=['sam', 'pam', 'panam', 'jym', 'kim'], name='score table')
scores.isin([20, 9])

sam       True
pam      False
panam    False
jym      False
kim       True
Name: score table, dtype: bool
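The boolean result of `isin()` is typically used as a mask to filter rows. A minimal sketch, with `mask`, `selected` and `others` as illustrative names:

```python
import pandas as pd

scores = pd.Series([20, 19, 18, 6, 9],
                   index=['sam', 'pam', 'panam', 'jym', 'kim'])

mask = scores.isin([20, 9])

selected = scores[mask]    # rows whose value is 20 or 9
others = scores[~mask]     # ~ inverts the mask: every other row

print(list(selected.index))   # ['sam', 'kim']
print(list(others.index))     # ['pam', 'panam', 'jym']
```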

---
The **copy()** method: simply makes a copy of a Series (or DataFrame).

In [40]:
my_new_serie = scores.copy()
my_new_serie

sam      20
pam      19
panam    18
jym       6
kim       9
Name: score table, dtype: int64

---
Here is how to **change the value** at a specified index:

In [41]:
my_new_serie['sam']=0
my_new_serie

sam       0
pam      19
panam    18
jym       6
kim       9
Name: score table, dtype: int64
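The reason to work on a copy is that changes to it do not touch the original. A short sketch of that, where `backup` is just an illustrative name:

```python
import pandas as pd

scores = pd.Series([20, 19, 18, 6, 9],
                   index=['sam', 'pam', 'panam', 'jym', 'kim'])

backup = scores.copy()   # an independent copy (deep by default)
backup['sam'] = 0        # modify only the copy

print(scores['sam'])   # 20
print(backup['sam'])   # 0
```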

---
Here is how **sorting** works in pandas. Note that `sort_values()` returns a new sorted Series and leaves `scores` itself unchanged:


In [43]:
scores.sort_values()

jym       6
kim       9
panam    18
pam      19
sam      20
Name: score table, dtype: int64

In [45]:
scores

sam      20
pam      19
panam    18
jym       6
kim       9
Name: score table, dtype: int64

In [49]:
scores.sort_values(ascending=False)

sam      20
pam      19
panam    18
kim       9
jym       6
Name: score table, dtype: int64
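Because sorting returns a new Series, the result has to be assigned to a name if you want to keep it. The sketch below also shows `sort_index()`, which is not covered above and sorts by the labels instead of the values.

```python
import pandas as pd

scores = pd.Series([20, 19, 18, 6, 9],
                   index=['sam', 'pam', 'panam', 'jym', 'kim'])

# Keep the sorted result by assigning it; scores itself is not modified.
ascending = scores.sort_values()

# sort_index orders the rows by their labels (alphabetically here).
by_name = scores.sort_index()

print(ascending.tolist())       # [6, 9, 18, 19, 20]
print(list(by_name.index))      # ['jym', 'kim', 'pam', 'panam', 'sam']
print(scores.tolist())          # [20, 19, 18, 6, 9]
```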

There is another method related to sorting, **rank()**, which gives each value's rank in the sorted order (1 for the smallest value by default):

In [50]:
scores.rank()

sam      5.0
pam      4.0
panam    3.0
jym      1.0
kim      2.0
Name: score table, dtype: float64
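Two options of `rank()` are handy in practice: `ascending=False` gives rank 1 to the highest value, and `method` controls how ties are handled (the default averages the tied ranks, which is why the result is a float). The `tied` Series below is invented for illustration.

```python
import pandas as pd

scores = pd.Series([20, 19, 18, 6, 9],
                   index=['sam', 'pam', 'panam', 'jym', 'kim'])

# Rank 1 goes to the highest score.
top_first = scores.rank(ascending=False)
print(top_first.tolist())   # [1.0, 2.0, 3.0, 5.0, 4.0]

# A hypothetical Series with a tie between 'a' and 'b'.
tied = pd.Series([10, 10, 7], index=['a', 'b', 'c'])

print(tied.rank().tolist())               # [2.5, 2.5, 1.0]
print(tied.rank(method='min').tolist())   # [2.0, 2.0, 1.0]
```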

---
Should all the indexes be unique? The answer is no, pandas does not require index labels to be unique. Check out this example:

In [52]:
best_players = pd.Series([19, 20, 18, 17], index=['ronaldo', 'messi', 'ronaldo', 'mbape'])
best_players

ronaldo    19
messi      20
ronaldo    18
mbape      17
dtype: int64

If a label that appears more than once is looked up, all of its values are returned:

In [53]:
best_players['ronaldo']

ronaldo    19
ronaldo    18
dtype: int64

The index also has an attribute named **is_unique**, which tells whether all labels are unique:

In [55]:
best_players.index.is_unique

False
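With a non-unique index, a lookup returns a single value for a unique label and a Series for a repeated one. If that is a problem, `Index.duplicated()` can flag the repeats. A minimal sketch, where `dup_mask` and `first_only` are illustrative names:

```python
import pandas as pd

best_players = pd.Series([19, 20, 18, 17],
                         index=['ronaldo', 'messi', 'ronaldo', 'mbape'])

# A unique label returns a single value, not a Series.
print(best_players['messi'])   # 20

# duplicated() marks every repeat of a label after its first occurrence.
dup_mask = best_players.index.duplicated()
print(dup_mask.tolist())       # [False, False, True, False]

# Keep only the first row for each label.
first_only = best_players[~dup_mask]
print(first_only.index.is_unique)   # True
```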

---
If summary statistics of a Series are required, the **describe()** method comes in handy!

In [57]:
scores.describe()

count     5.000000
mean     14.400000
std       6.426508
min       6.000000
25%       9.000000
50%      18.000000
75%      19.000000
max      20.000000
Name: score table, dtype: float64
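The result of `describe()` is itself a Series, so individual statistics can be picked out by label. They match the dedicated methods such as `mean()`, `std()` and `median()`, as this short sketch shows:

```python
import pandas as pd

scores = pd.Series([20, 19, 18, 6, 9],
                   index=['sam', 'pam', 'panam', 'jym', 'kim'])

summary = scores.describe()

# Pick single statistics out of the summary by their label.
print(summary['mean'])             # 14.4
print(round(summary['std'], 6))    # 6.426508

# The '50%' row is the median.
print(summary['50%'] == scores.median())   # True
```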