# Video: Digging into Pandas Data Frames

This video looks deeper into the details that pandas data frames manage for you.

Slide: How Pandas Describes Its Data Frames

> Two-dimensional, size-mutable, potentially heterogeneous tabular data.
>
> Data structure also contains labeled axes (rows and columns).
> Arithmetic operations align on both row and column labels.
> Can be thought of as a dict-like container for Series objects.
> The primary pandas data structure.
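
Two of the claims in this description, label alignment and the dict-like view of columns, can be seen on a very small example. The sketch below uses two hypothetical frames (`left` and `right`, not part of the abalone data) whose row and column labels only partly overlap.

```python
import pandas as pd

# Two small hypothetical frames with partially overlapping labels.
left = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["x", "y"])
right = pd.DataFrame({"b": [10, 20], "c": [30, 40]}, index=["y", "z"])

# Addition aligns on both row and column labels.
# Only row "y", column "b" exists in both frames; every other cell becomes NaN.
total = left + right
print(total)

# Dict-like behavior: iterating a data frame yields its column labels,
# and indexing with a label returns that column as a Series.
column_types = {name: type(left[name]).__name__ for name in left}
print(column_types)
```

The result covers the union of the labels (rows `x`, `y`, `z` and columns `a`, `b`, `c`), which is why alignment can introduce missing values even when neither input has any.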

## Code Example: Inspecting Data Frames

In [None]:
import pandas as pd
abalone = pd.read_csv("https://raw.githubusercontent.com/bu-omds/bu-omds-data/main/data/abalone.tsv", sep="\t")

In [None]:
abalone.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4177 entries, 0 to 4176
Data columns (total 9 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   Sex             4177 non-null   object 
 1   Length          4177 non-null   float64
 2   Diameter        4177 non-null   float64
 3   Height          4177 non-null   float64
 4   Whole_weight    4177 non-null   float64
 5   Shucked_weight  4177 non-null   float64
 6   Viscera_weight  4177 non-null   float64
 7   Shell_weight    4177 non-null   float64
 8   Rings           4177 non-null   int64  
dtypes: float64(7), int64(1), object(1)
memory usage: 293.8+ KB
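
The `+` after the memory figure appears when the frame has object columns (here, `Sex`): by default pandas counts only the references stored in such a column, not the Python objects they point to. As a rough sketch on a hypothetical three-row frame, `memory_usage(deep=True)` asks pandas to measure those objects as well.

```python
import pandas as pd

# Hypothetical miniature frame; "Sex" is forced to object dtype to mirror the output above.
sample = pd.DataFrame({
    "Sex": pd.Series(["M", "F", "I"], dtype="object"),
    "Rings": [15, 7, 9],
})

shallow = sample.memory_usage()        # bytes per column; object columns count references only
deep = sample.memory_usage(deep=True)  # also measures the Python strings being referenced

print(shallow)
print(deep)
```

The numeric column reports the same size either way. Only the object column grows under the deep measurement.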


In [None]:
abalone.columns

Index(['Sex', 'Length', 'Diameter', 'Height', 'Whole_weight', 'Shucked_weight',
       'Viscera_weight', 'Shell_weight', 'Rings'],
      dtype='object')

In [None]:
abalone["Length"]

0       0.455
1       0.350
2       0.530
3       0.440
4       0.330
        ...  
4172    0.565
4173    0.590
4174    0.600
4175    0.625
4176    0.710
Name: Length, Length: 4177, dtype: float64

In [None]:
type(abalone["Length"])

pandas.core.series.Series

In [None]:
abalone["Length"].info()

<class 'pandas.core.series.Series'>
RangeIndex: 4177 entries, 0 to 4176
Series name: Length
Non-Null Count  Dtype  
--------------  -----  
4177 non-null   float64
dtypes: float64(1)
memory usage: 32.8 KB


In [None]:
abalone["Rings"].dtype

dtype('int64')

In [None]:
abalone.index

RangeIndex(start=0, stop=4177, step=1)

In [None]:
abalone.shape

(4177, 9)

In [None]:
abalone["Rings"].shape

(4177,)

In [None]:
abalone.size

37593

In [None]:
abalone["Rings"].size

4177
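
The `shape` and `size` outputs above are related: `size` is the total number of cells, so for a data frame it is the number of rows times the number of columns (4177 × 9 = 37593 here), and for a Series it is just the length. A minimal check on a hypothetical small frame:

```python
import pandas as pd

# Hypothetical miniature frame standing in for the abalone data.
sample = pd.DataFrame({"Sex": ["M", "F", "I"], "Rings": [15, 7, 9]})

rows, cols = sample.shape        # a data frame's shape is a (rows, columns) tuple
frame_cells = sample.size        # total number of cells in the frame

ring_shape = sample["Rings"].shape   # a Series' shape is a one-element tuple
ring_cells = sample["Rings"].size    # which matches its length

print(rows, cols, frame_cells)
print(ring_shape, ring_cells)
```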

In [None]:
abalone.flags

<Flags(allows_duplicate_labels=True)>

In [None]:
abalone["Rings"].flags

<Flags(allows_duplicate_labels=True)>

In [None]:
abalone["Rings"].array

<PandasArray>
[15,  7,  9, 10,  7,  8, 20, 16,  9, 19,
 ...
  9,  8, 10, 10,  8, 11, 10,  9, 10, 12]
Length: 4177, dtype: int64

In [None]:
abalone["Rings"].to_numpy()

array([15,  7,  9, ...,  9, 10, 12])

In [None]:
abalone.to_numpy()

array([['M', 0.455, 0.365, ..., 0.101, 0.15, 15],
       ['M', 0.35, 0.265, ..., 0.0485, 0.07, 7],
       ['F', 0.53, 0.42, ..., 0.1415, 0.21, 9],
       ...,
       ['M', 0.6, 0.475, ..., 0.2875, 0.308, 9],
       ['F', 0.625, 0.485, ..., 0.261, 0.296, 10],
       ['M', 0.71, 0.555, ..., 0.3765, 0.495, 12]], dtype=object)
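
Note the `dtype=object` on the whole-frame array: a NumPy array has a single dtype, so mixing the text `Sex` column with numeric columns falls back to generic Python objects. One possible way around this, sketched on a hypothetical small frame, is to select only the numeric columns before converting.

```python
import pandas as pd

# Hypothetical miniature frame with one text column and two numeric columns.
sample = pd.DataFrame({
    "Sex": ["M", "F"],
    "Length": [0.455, 0.530],
    "Rings": [15, 9],
})

mixed = sample.to_numpy()                        # text + numbers -> object array
numeric = sample[["Length", "Rings"]].to_numpy() # float64 + int64 -> float64 array

# select_dtypes picks the numeric columns without naming them explicitly.
numeric_auto = sample.select_dtypes(include="number").to_numpy()

print(mixed.dtype, numeric.dtype, numeric_auto.dtype)
```

In the numeric-only array the integer `Rings` values are stored as floats, since that is the common type that can hold both columns.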