### pandas DataFrame Labels as Sequences

We can get the DataFrame’s row labels with ``.index`` and its column labels with ``.columns``.

In [2]:

import os
import pandas as pd

directory = 'resources'

# Make sure the data directory exists; candidate.csv must already be in it.
os.makedirs(directory, exist_ok=True)

file_path = os.path.join(directory, 'candidate.csv')

# Use the first CSV column (101, 102, ...) as the row labels.
df = pd.read_csv(file_path, index_col=0)

print(df)

print(f'Row Labels : {df.index}')
print(f'Columns Labels : {df.columns}')

       name         city  age  py-score
101  Xavier  Mexico City   41      88.0
102     Ann      Toronto   28      79.0
103    Jana       Prague   33      81.0
104      Yi     Shanghai   34      80.0
105   Robin   Manchester   38      68.0
106    Amal        Cairo   31      61.0
107    Nori        Osaka   37      84.0
Row Labels : Index([101, 102, 103, 104, 105, 106, 107], dtype='int64')
Columns Labels : Index(['name', 'city', 'age', 'py-score'], dtype='object')


The row and column labels are special kinds of sequences: both ``.index`` and ``.columns`` return pandas ``Index`` objects.

As with any other Python sequence, we can get a single item by position, for example ``df.index[1]`` or ``df.columns[1]``.

In addition to extracting a particular item, we can apply other sequence operations, including iterating through the labels of rows or columns.

In [4]:
import numpy as np

print('Row Labels:->')

# Index objects are iterable, so we can loop over the labels directly.
for row_label in df.index:
    print(row_label, end=' ')

# Assigning a new sequence to .index replaces the labels.
# It must contain exactly one label per row (7 here).
row_labels = np.arange(100, 107)

df.index = row_labels

print()
print('New Row Labels:->')

for row_label in df.index:
    print(row_label, end=' ')

print()  # end the label line before printing the DataFrame
print('Dataframe')
print(df)

Row Labels:->
101 102 103 104 105 106 107 
New Row Labels:->
100 101 102 103 104 105 106 
Dataframe
       name         city  age  py-score
100  Xavier  Mexico City   41      88.0
101     Ann      Toronto   28      79.0
102    Jana       Prague   33      81.0
103      Yi     Shanghai   34      80.0
104   Robin   Manchester   38      68.0
105    Amal        Cairo   31      61.0
106    Nori        Osaka   37      84.0
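
As a brief sketch of the other sequence operations mentioned above, the snippet below rebuilds the first three rows of the table inline (so it does not depend on ``candidate.csv``) and tries indexing, slicing, ``len()``, and a membership test on the labels. The variable name ``sample`` is only illustrative.

```python
import pandas as pd

# Illustrative stand-in for the first three rows of the table above.
sample = pd.DataFrame(
    {'name': ['Xavier', 'Ann', 'Jana'],
     'city': ['Mexico City', 'Toronto', 'Prague'],
     'age': [41, 28, 33],
     'py-score': [88.0, 79.0, 81.0]},
    index=[101, 102, 103],
)

second_row_label = sample.index[1]     # single item by position
second_col_label = sample.columns[1]
first_two_labels = sample.index[:2]    # slicing returns another Index
n_columns = len(sample.columns)        # len() works as on any sequence
has_age = 'age' in sample.columns      # membership test

print(second_row_label, second_col_label)
print(list(first_two_labels))
print(n_columns, has_age)
```

Note that the positions are zero-based, so ``sample.index[1]`` is the second row label, not the first.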


### DataFrame as NumPy Arrays

To extract the data from a pandas DataFrame as a NumPy array, without its row and column labels, we can use either ``.to_numpy()`` or ``.values``.

Both ``.to_numpy()`` and ``.values`` work similarly, and they both return a NumPy array with the data from the pandas DataFrame:

![image.png](attachment:image.png)

The pandas documentation recommends ``.to_numpy()`` because of the flexibility offered by its optional parameters, including:

1. **dtype:** Use this parameter to specify the data type of the resulting array. It’s set to ``None`` by default.

2. **copy:** Set this parameter to ``True`` to guarantee that the result is a copy of the data. With the default, ``False``, pandas may return a view of the original data when possible, but it may still have to copy.

However, ``.values`` has been around for much longer than ``.to_numpy()``, which was introduced in pandas version 0.24.0.

In [5]:
print('Extract DataFrame Data as Numpy Array using .to_numpy method')

print(df.to_numpy())

print('Extract DataFrame Data as Numpy Array using .values property')

print(df.values)

Extract DataFrame Data as Numpy Array using .to_numpy method
[['Xavier' 'Mexico City' 41 88.0]
 ['Ann' 'Toronto' 28 79.0]
 ['Jana' 'Prague' 33 81.0]
 ['Yi' 'Shanghai' 34 80.0]
 ['Robin' 'Manchester' 38 68.0]
 ['Amal' 'Cairo' 31 61.0]
 ['Nori' 'Osaka' 37 84.0]]
Extract DataFrame Data as Numpy Array using .values property
[['Xavier' 'Mexico City' 41 88.0]
 ['Ann' 'Toronto' 28 79.0]
 ['Jana' 'Prague' 33 81.0]
 ['Yi' 'Shanghai' 34 80.0]
 ['Robin' 'Manchester' 38 68.0]
 ['Amal' 'Cairo' 31 61.0]
 ['Nori' 'Osaka' 37 84.0]]
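
To make the two parameters more concrete, here is a minimal sketch that uses only the numeric columns of the first three rows, rebuilt inline. The names ``scores``, ``as_float32``, and ``independent`` are illustrative and do not appear in the notebook above.

```python
import numpy as np
import pandas as pd

# Illustrative stand-in for the numeric columns of the table above.
scores = pd.DataFrame(
    {'age': [41, 28, 33], 'py-score': [88.0, 79.0, 81.0]},
    index=[101, 102, 103],
)

# dtype: ask for a specific element type in the resulting array.
as_float32 = scores.to_numpy(dtype=np.float32)

# copy=True: guarantee the array is independent of the DataFrame.
independent = scores.to_numpy(copy=True)
independent[0, 0] = -1    # change the array only

print(as_float32.dtype)
print(independent[0, 0], scores.loc[101, 'age'])
```

Because ``independent`` is a guaranteed copy, writing to it leaves ``scores`` untouched.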


### Data Types

We can get the data types for each column of a pandas DataFrame with ``.dtypes``.

``.dtypes`` returns a Series object with the column names as labels and the corresponding data types as values.

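To modify the data type of one or more columns, use ``.astype()``. It returns a new object rather than changing the DataFrame in place, so assign the result back to keep it.
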
In [6]:
print('Fetching Data Types of DataFrame')

print(df.dtypes)

print('Modifying the Data Types Of Columns')

df = df.astype(dtype={'age': np.int32, 'py-score': np.float32})

print(df.dtypes)

Fetching Data Types of DataFrame
name         object
city         object
age           int64
py-score    float64
dtype: object
Modifying the Data Types Of Columns
name         object
city         object
age           int32
py-score    float32
dtype: object


> **Note:** The only mandatory parameter of ``.astype()`` is ``dtype``. It expects either a single data type or a dictionary that maps column names to data types.
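
The two forms of ``dtype`` can be sketched as follows, again on a small inline copy of the first three rows. The names ``people``, ``converted``, and ``age_as_float`` are illustrative only.

```python
import numpy as np
import pandas as pd

# Illustrative stand-in for part of the table above.
people = pd.DataFrame(
    {'name': ['Xavier', 'Ann', 'Jana'],
     'age': [41, 28, 33],
     'py-score': [88.0, 79.0, 81.0]},
    index=[101, 102, 103],
)

# A dictionary converts only the listed columns.
converted = people.astype({'age': np.int32, 'py-score': np.float32})

# A single data type applies to every value of the object it is called on.
age_as_float = people['age'].astype('float64')

print(converted.dtypes)
print(age_as_float.dtype)
print(people['age'].dtype)    # the original DataFrame is unchanged
```

Calling ``.astype()`` with a single data type on the whole ``people`` DataFrame would also try to convert the ``name`` column, which is why the dictionary form is usually the safer choice for mixed data.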


### pandas DataFrame Size (.ndim, .shape, and .size)

The attributes ``.ndim``, ``.shape``, and ``.size`` return the number of dimensions, number of data values across each dimension, and total number of data values, respectively:

-   ``.ndim`` returns the number of dimensions, which is 2 for a DataFrame instance. A Series object has an ``.ndim`` value of 1.

-   ``.shape`` returns a tuple with the number of rows and the number of columns.

-   ``.size`` returns an integer equal to the number of values in the DataFrame, that is, the number of rows multiplied by the number of columns.

In [7]:
print(f'Dimension of DataFrame Instance df is : {df.ndim}')

print(f'Shape of DataFrame Instance df is : {df.shape}')

print(f'Size of DataFrame Instance df is : {df.size}')

Dimension of DataFrame Instance df is : 2
Shape of DataFrame Instance df is : (7, 4)
Size of DataFrame Instance df is : 28
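
For comparison, the sketch below checks the same three attributes on a small inline DataFrame and on a single column taken from it, which is a Series. The names ``grid`` and ``ages`` are illustrative.

```python
import pandas as pd

# Illustrative stand-in for the numeric columns of the table above.
grid = pd.DataFrame(
    {'age': [41, 28, 33], 'py-score': [88.0, 79.0, 81.0]},
    index=[101, 102, 103],
)

n_rows, n_cols = grid.shape    # shape unpacks like any tuple
ages = grid['age']             # selecting one column gives a Series

print(grid.ndim, grid.shape, grid.size)
print(ages.ndim, ages.shape, ages.size)
print(grid.size == n_rows * n_cols)
```

The shape of a Series is a one-element tuple, so its ``.size`` equals its length.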
