# `pandas` DataFrames

## Setup

In [1]:
import pandas as pd

In [2]:
# Warning: the output shows the install path, which can leak private information (e.g. your user name)!
pd

<module 'pandas' from 'C:\\Users\\Bea\\anaconda3\\lib\\site-packages\\pandas\\__init__.py'>

In [3]:
pd.__version__

'1.2.4'

## Creation

Creation of an example DataFrame (starting from a dictionary of dictionaries):

In [4]:
data = {
    "Capital": {
        "Spain": "Madrid",
        "Belgium": "Brussels",
        "France": "Paris",
        "Italy": "Roma",
        "Germany": "Berlin",
        "Portugal": "Lisbon",
        "Norway": "Oslo",
        "Greece": "Athens",
    },
    "Population": {
        "Spain": 46733038,
        "Belgium": 11449656,
        "France": 67076000,
        "Italy": 60390560,
        "Germany": 83122889,
        "Portugal": 10295909,
        "Norway": 5391369,
        "Greece": 10718565,
    },
    "Monarch": {
        "Spain": "Felipe VI",
        "Belgium": "Philippe",
        "Norway": "Harald V",
    },
    "Area": {
        "Spain": 505990,
        "Belgium": 30688,
        "France": 640679,
        "Italy": 301340,
        "Germany": 357022,
        "Portugal": 92212,
        "Norway": 385207,
        "Greece": 131957,
    },
}

In [5]:
# For now, let's forget about these steps:
df = pd.DataFrame(data)
df["Capital"] = df["Capital"].astype("string")
df["Monarch"] = df["Monarch"].astype("string")
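
As a small illustration of how a dictionary of dictionaries maps onto a table, the sketch below uses a hypothetical two-column dictionary `toy` (an assumption made for this example, not part of the data above). The outer keys become column names, the inner keys become row labels, and an absent inner key yields a missing value.

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the dictionary-of-dictionaries layout
toy = {
    "Capital": {"Spain": "Madrid", "France": "Paris"},
    "Monarch": {"Spain": "Felipe VI"},  # no entry for France
}

toy_df = pd.DataFrame(toy)

print(list(toy_df.columns))               # the outer keys
print(list(toy_df.index))                 # the union of the inner keys
print(toy_df["Monarch"].isna().tolist())  # which rows lack a monarch entry
```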

Apple stock data, taken from the [`matplotlib` sample datasets](https://github.com/matplotlib/sample_data/blob/master/aapl.csv):

In [6]:
# For now, let's forget about these steps:
apple = pd.read_csv("AAPL.csv")
apple["Date"] = apple["Date"].astype("datetime64[ns]")
apple = apple.set_index("Date")
apple = apple.sort_index()
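
The same preparation can probably be done while reading the file. The following is a minimal sketch, assuming the CSV has the column layout shown in the outputs further down; a few rows are inlined through `io.StringIO` so the example runs without the file, and the name `sample` is hypothetical.

```python
import io

import pandas as pd

# A few rows in the assumed layout of the Apple CSV, inlined to keep the example self-contained
csv_text = """Date,Open,High,Low,Close,Volume,Adj Close
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-07,26.50,26.87,26.25,26.50,2981600,3.02
1984-09-10,26.50,26.62,25.87,26.37,2346400,3.01
"""

sample = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["Date"],  # convert the column to datetime64 while reading
    index_col="Date",      # use it as the row labels
).sort_index()             # order the rows chronologically

print(sample.index)
```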

## Demo 1: Anatomy of a `pandas` DataFrame

Check the DataFrame:

In [7]:
df

,Capital,Population,Monarch,Area
Spain,Madrid,46733038,Felipe VI,505990
Belgium,Brussels,11449656,Philippe,30688
France,Paris,67076000,,640679
Italy,Roma,60390560,,301340
Germany,Berlin,83122889,,357022
Portugal,Lisbon,10295909,,92212
Norway,Oslo,5391369,Harald V,385207
Greece,Athens,10718565,,131957


Check the type of the DataFrame:

In [8]:
type(df)

pandas.core.frame.DataFrame

A DataFrame has:
- **column names** (shown in bold)
- **row labels** (shown in bold; also called the **index**)

Ideally, **each column contains elements of the same data type** (e.g. strings, integers, floats, booleans, or dates).

Some data may be missing; missing values are shown as `<NA>` or as a dtype-specific variant such as `NaN` or `NaT`.

In a DataFrame, both **columns & rows are ordered**.
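
How a missing value is displayed depends on the data type of the column. The sketch below builds three small, hypothetical Series (not taken from the data above) to show the common markers; whatever the marker, `isna()` reports the gap in the same way.

```python
import numpy as np
import pandas as pd

floats = pd.Series([1.5, np.nan])                            # float64: shown as NaN
strings = pd.Series(["a", pd.NA], dtype="string")            # string dtype: shown as <NA>
timestamps = pd.Series(pd.to_datetime(["2008-10-14", None]))  # datetime64: shown as NaT

for series in (floats, strings, timestamps):
    print(series.dtype, series.isna().tolist())
```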

## Exercise 1

Check the DataFrame:

In [9]:
apple

Date,Open,High,Low,Close,Volume,Adj Close
1984-09-07,26.50,26.87,26.25,26.50,2981600,3.02
1984-09-10,26.50,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-12,26.87,27.00,26.12,26.12,4773600,2.98
1984-09-13,27.50,27.62,27.50,27.50,7429600,3.14
...,...,...,...,...,...,...
2008-10-08,85.91,96.33,85.68,89.79,78847900,89.79
2008-10-09,93.35,95.80,86.60,88.74,57763700,88.74
2008-10-10,85.70,100.00,85.00,96.80,79260700,96.80
2008-10-13,104.55,110.53,101.02,110.26,54967000,110.26


Check the type of the DataFrame:

In [10]:
type(apple)

pandas.core.frame.DataFrame

## Demo 2: View a `DataFrame`

In [11]:
df

,Capital,Population,Monarch,Area
Spain,Madrid,46733038,Felipe VI,505990
Belgium,Brussels,11449656,Philippe,30688
France,Paris,67076000,,640679
Italy,Roma,60390560,,301340
Germany,Berlin,83122889,,357022
Portugal,Lisbon,10295909,,92212
Norway,Oslo,5391369,Harald V,385207
Greece,Athens,10718565,,131957


View a DataFrame without the fancy HTML representation:

In [12]:
print(df)

           Capital  Population    Monarch    Area
Spain       Madrid    46733038  Felipe VI  505990
Belgium   Brussels    11449656   Philippe   30688
France       Paris    67076000       <NA>  640679
Italy         Roma    60390560       <NA>  301340
Germany     Berlin    83122889       <NA>  357022
Portugal    Lisbon    10295909       <NA>   92212
Norway        Oslo     5391369   Harald V  385207
Greece      Athens    10718565       <NA>  131957


Check the first 5 rows of the DataFrame:

In [13]:
df.head()

,Capital,Population,Monarch,Area
Spain,Madrid,46733038,Felipe VI,505990
Belgium,Brussels,11449656,Philippe,30688
France,Paris,67076000,,640679
Italy,Roma,60390560,,301340
Germany,Berlin,83122889,,357022


Check the first 3 rows of the DataFrame:

In [14]:
df.head(3)

,Capital,Population,Monarch,Area
Spain,Madrid,46733038,Felipe VI,505990
Belgium,Brussels,11449656,Philippe,30688
France,Paris,67076000,,640679


Check the last 5 rows of the DataFrame:

In [15]:
df.tail()

,Capital,Population,Monarch,Area
Italy,Roma,60390560,,301340
Germany,Berlin,83122889,,357022
Portugal,Lisbon,10295909,,92212
Norway,Oslo,5391369,Harald V,385207
Greece,Athens,10718565,,131957


Check the last 2 rows of the DataFrame:

In [16]:
df.tail(2)

,Capital,Population,Monarch,Area
Norway,Oslo,5391369,Harald V,385207
Greece,Athens,10718565,,131957
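
Both methods also accept a negative number, which counts from the other end. The sketch below uses a hypothetical five-row DataFrame called `letters` to show this, along with what happens when more rows are requested than exist.

```python
import pandas as pd

letters = pd.DataFrame({"value": range(5)}, index=list("abcde"))

all_but_last_two = letters.head(-2)   # everything except the last 2 rows
all_but_first_two = letters.tail(-2)  # everything except the first 2 rows
too_many = letters.head(100)          # asking for more rows than exist returns them all

print(all_but_last_two.index.tolist())
print(all_but_first_two.index.tolist())
print(len(too_many))
```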


## Exercise 2

In [17]:
apple

Date,Open,High,Low,Close,Volume,Adj Close
1984-09-07,26.50,26.87,26.25,26.50,2981600,3.02
1984-09-10,26.50,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-12,26.87,27.00,26.12,26.12,4773600,2.98
1984-09-13,27.50,27.62,27.50,27.50,7429600,3.14
...,...,...,...,...,...,...
2008-10-08,85.91,96.33,85.68,89.79,78847900,89.79
2008-10-09,93.35,95.80,86.60,88.74,57763700,88.74
2008-10-10,85.70,100.00,85.00,96.80,79260700,96.80
2008-10-13,104.55,110.53,101.02,110.26,54967000,110.26


Check the first 5 rows of the DataFrame:

In [18]:
apple.head()

Date,Open,High,Low,Close,Volume,Adj Close
1984-09-07,26.5,26.87,26.25,26.5,2981600,3.02
1984-09-10,26.5,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-12,26.87,27.0,26.12,26.12,4773600,2.98
1984-09-13,27.5,27.62,27.5,27.5,7429600,3.14


Check the first 10 rows of the DataFrame:

In [19]:
apple.head(10)

Date,Open,High,Low,Close,Volume,Adj Close
1984-09-07,26.5,26.87,26.25,26.5,2981600,3.02
1984-09-10,26.5,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-12,26.87,27.0,26.12,26.12,4773600,2.98
1984-09-13,27.5,27.62,27.5,27.5,7429600,3.14
1984-09-14,27.62,28.5,27.62,27.87,8826400,3.18
1984-09-17,28.62,29.0,28.62,28.62,6886400,3.27
1984-09-18,28.62,28.87,27.62,27.62,3495200,3.15
1984-09-19,27.62,27.87,27.0,27.0,3816000,3.08
1984-09-20,27.12,27.37,27.12,27.12,2387200,3.09


Check the last 5 rows of the DataFrame:

In [20]:
apple.tail()

Date,Open,High,Low,Close,Volume,Adj Close
2008-10-08,85.91,96.33,85.68,89.79,78847900,89.79
2008-10-09,93.35,95.8,86.6,88.74,57763700,88.74
2008-10-10,85.7,100.0,85.0,96.8,79260700,96.8
2008-10-13,104.55,110.53,101.02,110.26,54967000,110.26
2008-10-14,116.26,116.4,103.14,104.08,70749800,104.08


Check the last 8 rows of the DataFrame:

In [21]:
apple.tail(8)

Date,Open,High,Low,Close,Volume,Adj Close
2008-10-03,104.0,106.5,94.65,97.07,81942800,97.07
2008-10-06,91.96,98.78,87.54,98.14,75264900,98.14
2008-10-07,100.48,101.5,88.95,89.16,67099000,89.16
2008-10-08,85.91,96.33,85.68,89.79,78847900,89.79
2008-10-09,93.35,95.8,86.6,88.74,57763700,88.74
2008-10-10,85.7,100.0,85.0,96.8,79260700,96.8
2008-10-13,104.55,110.53,101.02,110.26,54967000,110.26
2008-10-14,116.26,116.4,103.14,104.08,70749800,104.08


## Demo 3: Shape

Check the shape:

In [22]:
df.shape

(8, 4)

Check the length:

In [23]:
len(df)

8
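
The shape is an ordinary tuple of the form `(rows, columns)`, so it can be unpacked directly. A minimal sketch on a hypothetical DataFrame `small`, also showing how `len()` and the `size` attribute relate to it:

```python
import pandas as pd

small = pd.DataFrame({"x": [1, 2, 3], "y": [4.0, 5.0, 6.0]})

n_rows, n_cols = small.shape  # shape is a plain tuple: (rows, columns)

print(n_rows, n_cols)
print(len(small) == n_rows)           # len() counts rows only
print(small.size == n_rows * n_cols)  # size counts individual cells
```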

## Exercise 3

In [24]:
apple.head()

Date,Open,High,Low,Close,Volume,Adj Close
1984-09-07,26.5,26.87,26.25,26.5,2981600,3.02
1984-09-10,26.5,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-12,26.87,27.0,26.12,26.12,4773600,2.98
1984-09-13,27.5,27.62,27.5,27.5,7429600,3.14


Check the shape:

In [25]:
apple.shape

(6081, 6)

Check the length:

In [26]:
len(apple)

6081

## Demo 4: Columns & Index

Check the columns:

In [30]:
df.columns

Index(['Capital', 'Population', 'Monarch', 'Area'], dtype='object')

Check the type of the columns:

In [28]:
type(df.columns)

pandas.core.indexes.base.Index

Check the index (i.e. the row labels):

In [31]:
df.index

Index(['Spain', 'Belgium', 'France', 'Italy', 'Germany', 'Portugal', 'Norway',
       'Greece'],
      dtype='object')

Check the type of the index:

In [32]:
type(df.index)

pandas.core.indexes.base.Index

<div class="alert alert-info">

<b>Note:</b> Both the <code>columns</code> and the <code>index</code> are of the same <code>Index</code> type (because they have similar behaviours)

</div>
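
To make the shared behaviour concrete, the sketch below applies the same handful of operations to the columns and to the index of a hypothetical two-row DataFrame `frame`.

```python
import pandas as pd

frame = pd.DataFrame(
    {"Capital": ["Madrid", "Paris"], "Area": [505990, 640679]},
    index=["Spain", "France"],
)

# The same operations work on both sets of labels
for labels in (frame.columns, frame.index):
    print(type(labels).__name__, labels[0], labels.tolist())

print("Area" in frame.columns)  # membership test on the column names
print("Spain" in frame.index)   # membership test on the row labels
```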

## Exercise 4

In [33]:
apple.head()

Date,Open,High,Low,Close,Volume,Adj Close
1984-09-07,26.5,26.87,26.25,26.5,2981600,3.02
1984-09-10,26.5,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-12,26.87,27.0,26.12,26.12,4773600,2.98
1984-09-13,27.5,27.62,27.5,27.5,7429600,3.14


Check the columns:

In [34]:
apple.columns

Index(['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close'], dtype='object')

Check the type of the columns:

In [35]:
type(apple.columns)

pandas.core.indexes.base.Index

Check the index:

In [36]:
apple.index

DatetimeIndex(['1984-09-07', '1984-09-10', '1984-09-11', '1984-09-12',
               '1984-09-13', '1984-09-14', '1984-09-17', '1984-09-18',
               '1984-09-19', '1984-09-20',
               ...
               '2008-10-01', '2008-10-02', '2008-10-03', '2008-10-06',
               '2008-10-07', '2008-10-08', '2008-10-09', '2008-10-10',
               '2008-10-13', '2008-10-14'],
              dtype='datetime64[ns]', name='Date', length=6081, freq=None)

Check the type of the index:

In [37]:
type(apple.index)

pandas.core.indexes.datetimes.DatetimeIndex
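
A `DatetimeIndex` is a specialised kind of `Index`, so everything that works on a plain index still applies, with date-specific attributes on top. A minimal sketch using three of the dates shown above; the variable name `trading_days` is hypothetical.

```python
import pandas as pd

trading_days = pd.to_datetime(["1984-09-07", "1984-09-10", "2008-10-14"])

print(type(trading_days).__name__)
print(isinstance(trading_days, pd.Index))  # DatetimeIndex is a subclass of Index
print(trading_days.year.tolist())          # date components are available as attributes
print(trading_days.min(), trading_days.max())
```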

## Demo 5: Data types

Check the data type of each of the columns:

In [38]:
df.dtypes

Capital       string
Population     int64
Monarch       string
Area           int64
dtype: object
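
Data types can also be inspected one column at a time, or used to filter columns. The sketch below rebuilds two rows of the example as a hypothetical DataFrame `countries` and keeps only its numeric columns with `select_dtypes`.

```python
import pandas as pd

countries = pd.DataFrame(
    {
        "Capital": ["Madrid", "Brussels"],
        "Population": [46733038, 11449656],
        "Area": [505990, 30688],
    },
    index=["Spain", "Belgium"],
)

print(countries["Population"].dtype)  # dtype of a single column

numeric = countries.select_dtypes(include="number")  # keep only the numeric columns
print(numeric.columns.tolist())
```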

## Exercise 5

In [39]:
apple.head()

Date,Open,High,Low,Close,Volume,Adj Close
1984-09-07,26.5,26.87,26.25,26.5,2981600,3.02
1984-09-10,26.5,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-12,26.87,27.0,26.12,26.12,4773600,2.98
1984-09-13,27.5,27.62,27.5,27.5,7429600,3.14


Check the data type of each of the columns:

In [40]:
apple.dtypes

Open         float64
High         float64
Low          float64
Close        float64
Volume         int64
Adj Close    float64
dtype: object

## Demo 6: Missing data

Check which values are missing (`True` marks a missing value):

In [41]:
df.isnull()

,Capital,Population,Monarch,Area
Spain,False,False,False,False
Belgium,False,False,False,False
France,False,False,True,False
Italy,False,False,True,False
Germany,False,False,True,False
Portugal,False,False,True,False
Norway,False,False,False,False
Greece,False,False,True,False


Check the opposite, i.e. which values are present (`True` marks a non-missing value):

In [42]:
df.notnull()

,Capital,Population,Monarch,Area
Spain,True,True,True,True
Belgium,True,True,True,True
France,True,True,False,True
Italy,True,True,False,True
Germany,True,True,False,True
Portugal,True,True,False,True
Norway,True,True,True,True
Greece,True,True,False,True


Count the number of non-missing values in each column:

In [43]:
df.count()

Capital       8
Population    8
Monarch       3
Area          8
dtype: int64
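
The Boolean table can be summarised instead of read cell by cell. A minimal sketch on a hypothetical three-row DataFrame `monarchs`, using `isna()` (of which `isnull()` is an alias) to count the missing values per column and to get a single yes/no answer:

```python
import pandas as pd

monarchs = pd.DataFrame(
    {
        "Capital": ["Madrid", "Paris", "Oslo"],
        "Monarch": ["Felipe VI", None, "Harald V"],
    },
    index=["Spain", "France", "Norway"],
)

missing_per_column = monarchs.isna().sum()           # True counts as 1 when summed
has_any_missing = bool(monarchs.isna().any().any())  # collapse to one yes/no answer

print(missing_per_column)
print(has_any_missing)
```

Together with `count()`, the per-column totals add up to the number of rows.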

## Exercise 6

In [44]:
apple.head()

Date,Open,High,Low,Close,Volume,Adj Close
1984-09-07,26.5,26.87,26.25,26.5,2981600,3.02
1984-09-10,26.5,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-12,26.87,27.0,26.12,26.12,4773600,2.98
1984-09-13,27.5,27.62,27.5,27.5,7429600,3.14


Check which values are missing (`True` marks a missing value):

In [45]:
apple.isnull()

Date,Open,High,Low,Close,Volume,Adj Close
1984-09-07,False,False,False,False,False,False
1984-09-10,False,False,False,False,False,False
1984-09-11,False,False,False,False,False,False
1984-09-12,False,False,False,False,False,False
1984-09-13,False,False,False,False,False,False
...,...,...,...,...,...,...
2008-10-08,False,False,False,False,False,False
2008-10-09,False,False,False,False,False,False
2008-10-10,False,False,False,False,False,False
2008-10-13,False,False,False,False,False,False


Check the opposite, i.e. which values are present (`True` marks a non-missing value):

In [46]:
apple.notnull()

Date,Open,High,Low,Close,Volume,Adj Close
1984-09-07,True,True,True,True,True,True
1984-09-10,True,True,True,True,True,True
1984-09-11,True,True,True,True,True,True
1984-09-12,True,True,True,True,True,True
1984-09-13,True,True,True,True,True,True
...,...,...,...,...,...,...
2008-10-08,True,True,True,True,True,True
2008-10-09,True,True,True,True,True,True
2008-10-10,True,True,True,True,True,True
2008-10-13,True,True,True,True,True,True


Count the number of non-missing values in each column:

In [47]:
apple.count()

Open         6081
High         6081
Low          6081
Close        6081
Volume       6081
Adj Close    6081
dtype: int64

## Demo 7: `name` attributes

Check the `name` attribute of the DataFrame:

In [48]:
# Raises an error, because no name has been defined for the DataFrame:
df.name

AttributeError: 'DataFrame' object has no attribute 'name'

Check the `name` attribute of the index (it is `None` until a name is set, so nothing is displayed):

In [49]:
df.index.name

Check the `name` attribute of the columns (also `None` until a name is set):

In [50]:
df.columns.name

Set the name of the DataFrame (this attaches an ordinary Python attribute to the object; it does not appear in the display):

In [51]:
df.name = "countries"

In [52]:
df

,Capital,Population,Monarch,Area
Spain,Madrid,46733038,Felipe VI,505990
Belgium,Brussels,11449656,Philippe,30688
France,Paris,67076000,,640679
Italy,Roma,60390560,,301340
Germany,Berlin,83122889,,357022
Portugal,Lisbon,10295909,,92212
Norway,Oslo,5391369,Harald V,385207
Greece,Athens,10718565,,131957


In [53]:
df.name

'countries'
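
One caveat worth knowing: a name set this way lives only on that particular Python object. The sketch below, on a hypothetical one-column DataFrame `frame`, suggests that a copy does not carry the attribute along.

```python
import pandas as pd

frame = pd.DataFrame({"a": [1, 2]})
frame.name = "countries"  # stored as a plain Python attribute on this object only

copied = frame.copy()

print(hasattr(frame, "name"))   # the original object still has it
print(hasattr(copied, "name"))  # the copy is a new object without the attribute
```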

Set the name of the index:

In [54]:
df.index.name = "country"

In [55]:
df

country,Capital,Population,Monarch,Area
Spain,Madrid,46733038,Felipe VI,505990
Belgium,Brussels,11449656,Philippe,30688
France,Paris,67076000,,640679
Italy,Roma,60390560,,301340
Germany,Berlin,83122889,,357022
Portugal,Lisbon,10295909,,92212
Norway,Oslo,5391369,Harald V,385207
Greece,Athens,10718565,,131957


In [56]:
df.index.name

'country'

Set the name of the columns:

In [57]:
df.columns.name = "properties"

In [58]:
df

properties,Capital,Population,Monarch,Area
country,,,,
Spain,Madrid,46733038,Felipe VI,505990
Belgium,Brussels,11449656,Philippe,30688
France,Paris,67076000,,640679
Italy,Roma,60390560,,301340
Germany,Berlin,83122889,,357022
Portugal,Lisbon,10295909,,92212
Norway,Oslo,5391369,Harald V,385207
Greece,Athens,10718565,,131957


In [59]:
df.columns.name

'properties'
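
The index and columns names can also be set in a single step with `rename_axis`, which returns a new DataFrame rather than modifying the existing one. A minimal sketch on a hypothetical DataFrame `frame`:

```python
import pandas as pd

frame = pd.DataFrame({"Area": [505990, 30688]}, index=["Spain", "Belgium"])

# rename_axis returns a new DataFrame and leaves the original untouched
named = frame.rename_axis(index="country", columns="properties")

print(named.index.name, named.columns.name)
print(frame.index.name, frame.columns.name)
```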

## Exercise 7

In [60]:
apple.head()

Date,Open,High,Low,Close,Volume,Adj Close
1984-09-07,26.5,26.87,26.25,26.5,2981600,3.02
1984-09-10,26.5,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-12,26.87,27.0,26.12,26.12,4773600,2.98
1984-09-13,27.5,27.62,27.5,27.5,7429600,3.14


Check the `name` attribute of the DataFrame:

In [61]:
apple.name

AttributeError: 'DataFrame' object has no attribute 'name'

Check the `name` attribute of the index:

In [62]:
apple.index.name

'Date'

Check the `name` attribute of the columns:

In [63]:
apple.columns.name

Set the name of the DataFrame to "Apple Stock":

In [64]:
apple.name = "Apple Stock"

In [65]:
apple

Date,Open,High,Low,Close,Volume,Adj Close
1984-09-07,26.50,26.87,26.25,26.50,2981600,3.02
1984-09-10,26.50,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-12,26.87,27.00,26.12,26.12,4773600,2.98
1984-09-13,27.50,27.62,27.50,27.50,7429600,3.14
...,...,...,...,...,...,...
2008-10-08,85.91,96.33,85.68,89.79,78847900,89.79
2008-10-09,93.35,95.80,86.60,88.74,57763700,88.74
2008-10-10,85.70,100.00,85.00,96.80,79260700,96.80
2008-10-13,104.55,110.53,101.02,110.26,54967000,110.26


In [66]:
apple.name

'Apple Stock'

Set the name of the columns to "values":

In [67]:
apple.columns.name = "values"

In [68]:
apple

values,Open,High,Low,Close,Volume,Adj Close
Date,,,,,,
1984-09-07,26.50,26.87,26.25,26.50,2981600,3.02
1984-09-10,26.50,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-12,26.87,27.00,26.12,26.12,4773600,2.98
1984-09-13,27.50,27.62,27.50,27.50,7429600,3.14
...,...,...,...,...,...,...
2008-10-08,85.91,96.33,85.68,89.79,78847900,89.79
2008-10-09,93.35,95.80,86.60,88.74,57763700,88.74
2008-10-10,85.70,100.00,85.00,96.80,79260700,96.80
2008-10-13,104.55,110.53,101.02,110.26,54967000,110.26


In [69]:
apple.columns.name

'values'

## Demo 8: Underlying values (`numpy` arrays)

Check the underlying values:

In [70]:
df.values

array([['Madrid', 46733038, 'Felipe VI', 505990],
       ['Brussels', 11449656, 'Philippe', 30688],
       ['Paris', 67076000, <NA>, 640679],
       ['Roma', 60390560, <NA>, 301340],
       ['Berlin', 83122889, <NA>, 357022],
       ['Lisbon', 10295909, <NA>, 92212],
       ['Oslo', 5391369, 'Harald V', 385207],
       ['Athens', 10718565, <NA>, 131957]], dtype=object)

Note that `.values` returns the data as a `numpy` array:

In [71]:
type(df.values)

numpy.ndarray
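
Because a `numpy` array has a single data type, the columns are converted to a common type on the way out. The sketch below uses `to_numpy()`, which the pandas documentation recommends over `.values`, on two hypothetical DataFrames to show the two typical outcomes.

```python
import pandas as pd

mixed = pd.DataFrame({"Capital": ["Madrid", "Oslo"], "Area": [505990, 385207]})
numbers = pd.DataFrame({"Volume": [2981600, 2346400], "Close": [26.50, 26.37]})

mixed_array = mixed.to_numpy()      # text + integers: falls back to dtype=object
numeric_array = numbers.to_numpy()  # integers + floats: upcast to a common float dtype

print(mixed_array.dtype, mixed_array.shape)
print(numeric_array.dtype, numeric_array.shape)
```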

## Exercise 8

In [72]:
apple.head()

values,Open,High,Low,Close,Volume,Adj Close
Date,,,,,,
1984-09-07,26.5,26.87,26.25,26.5,2981600,3.02
1984-09-10,26.5,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-12,26.87,27.0,26.12,26.12,4773600,2.98
1984-09-13,27.5,27.62,27.5,27.5,7429600,3.14


Check the underlying values:

In [73]:
apple.values

array([[2.65000e+01, 2.68700e+01, 2.62500e+01, 2.65000e+01, 2.98160e+06,
        3.02000e+00],
       [2.65000e+01, 2.66200e+01, 2.58700e+01, 2.63700e+01, 2.34640e+06,
        3.01000e+00],
       [2.66200e+01, 2.73700e+01, 2.66200e+01, 2.68700e+01, 5.44400e+06,
        3.07000e+00],
       ...,
       [8.57000e+01, 1.00000e+02, 8.50000e+01, 9.68000e+01, 7.92607e+07,
        9.68000e+01],
       [1.04550e+02, 1.10530e+02, 1.01020e+02, 1.10260e+02, 5.49670e+07,
        1.10260e+02],
       [1.16260e+02, 1.16400e+02, 1.03140e+02, 1.04080e+02, 7.07498e+07,
        1.04080e+02]])

Check the type of the underlying values:

In [74]:
type(apple.values)

numpy.ndarray

## `DataFrame` column

Select a column from a DataFrame:

In [75]:
s = df["Capital"]

In [76]:
s

country
Spain         Madrid
Belgium     Brussels
France         Paris
Italy           Roma
Germany       Berlin
Portugal      Lisbon
Norway          Oslo
Greece        Athens
Name: Capital, dtype: string

In [77]:
type(s)

pandas.core.series.Series

<div class="alert alert-info">

<b>Note:</b> The <code>Series</code> gets the same index as the <code>DataFrame</code>!

</div>
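
This can be checked directly. The sketch below, on a hypothetical two-row DataFrame `frame`, compares the row labels of the selected column with those of its parent, and contrasts single brackets (a `Series`) with a list of names (a one-column `DataFrame`).

```python
import pandas as pd

frame = pd.DataFrame(
    {"Capital": ["Madrid", "Oslo"], "Area": [505990, 385207]},
    index=["Spain", "Norway"],
)

column = frame["Capital"]       # single brackets: a Series
sub_frame = frame[["Capital"]]  # a list of names: a one-column DataFrame

print(column.index.equals(frame.index))  # same row labels as the parent
print(column.name)                       # the Series is named after the column
print(type(sub_frame).__name__)
```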