Series and DataFrame

DataFrame: A DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). It is similar to a table in a database or a spreadsheet.

Series: A Series is a one-dimensional array-like object whose values all share a single data type. It corresponds to one column of a DataFrame.

1. Convert between a Series and a DataFrame column.
2. Convert back and forth between Python lists or dicts and Series or DataFrames.


In [10]:
import polars as pl
csv_file = './Files/Sample_Superstore-checkpoint.csv'

In [11]:
df = pl.read_csv(csv_file)

In [18]:
df

Row_ID,Order_ID,Order_Date,Ship_Date,Ship_Mode,Customer_ID,Customer_Name,Segment,Country,City,State,Postal_Code,Region,Product_ID,Category,Sub_Category,Product_Name,Sales,Quantity,Discount,Profit
i64,str,str,str,str,str,str,str,str,str,str,i64,str,str,str,str,str,f64,i64,f64,f64
1,,,"""11-11-2016""","""Second Class""","""CG-12520""","""Claire Gute""","""Consumer""","""United States""","""Henderson""","""Kentucky""",42420,"""South""","""FUR-BO-10001798""","""Furniture""","""Bookcases""","""Bush Somerset Collection Bookc…",261.96,2,0.0,41.9136
2,"""CA-2016-152156""","""08-11-2016""","""11-11-2016""","""Second Class""","""CG-12520""","""Claire Gute""","""Consumer""","""United States""","""Henderson""","""Kentucky""",42420,"""South""","""FUR-CH-10000454""","""Furniture""","""Chairs""","""Hon Deluxe Fabric Upholstered …",731.94,3,0.0,219.582
3,"""CA-2016-138688""","""12-06-2016""",,,"""DV-13045""","""Darrin Van Huff""","""Corporate""",,"""Los Angeles""","""California""",90036,"""West""","""OFF-LA-10000240""","""Office Supplies""","""Labels""","""Self-Adhesive Address Labels f…",14.62,2,0.0,6.8714
4,,"""11-10-2015""",,"""Standard Class""","""SO-20335""","""Sean O'Donnell""","""Consumer""","""United States""","""Fort Lauderdale""","""Florida""",33311,"""South""","""FUR-TA-10000577""","""Furniture""","""Tables""","""Bretford CR4500 Series Slim Re…",957.5775,5,0.45,-383.031
5,"""US-2015-108966""","""11-10-2015""","""18-10-2015""","""Standard Class""","""SO-20335""","""Sean O'Donnell""","""Consumer""","""United States""",,"""Florida""",33311,"""South""","""OFF-ST-10000760""","""Office Supplies""","""Storage""","""Eldon Fold 'N Roll Cart System""",22.368,2,0.2,2.5164
…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…,…
9990,"""CA-2014-110422""","""21-01-2014""","""23-01-2014""","""Second Class""","""TB-21400""","""Tom Boeckenhauer""","""Consumer""","""United States""","""Miami""","""Florida""",33180,"""South""","""FUR-FU-10001889""","""Furniture""","""Furnishings""","""Ultra Door Pull Handle""",25.248,3,0.2,4.1028
9991,"""CA-2017-121258""","""26-02-2017""","""03-03-2017""","""Standard Class""","""DB-13060""","""Dave Brooks""","""Consumer""","""United States""","""Costa Mesa""","""California""",92627,"""West""","""FUR-FU-10000747""","""Furniture""","""Furnishings""","""Tenex B1-RE Series Chair Mats …",91.96,2,0.0,15.6332
9992,"""CA-2017-121258""","""26-02-2017""","""03-03-2017""","""Standard Class""","""DB-13060""","""Dave Brooks""","""Consumer""","""United States""","""Costa Mesa""","""California""",92627,"""West""","""TEC-PH-10003645""","""Technology""","""Phones""","""Aastra 57i VoIP phone""",258.576,2,0.2,19.3932
9993,"""CA-2017-121258""","""26-02-2017""","""03-03-2017""","""Standard Class""","""DB-13060""","""Dave Brooks""","""Consumer""","""United States""","""Costa Mesa""","""California""",92627,"""West""","""OFF-PA-10004041""","""Office Supplies""","""Paper""","""It's Hot Message Books with St…",29.6,4,0.0,13.32


In [19]:
type(df.head(3))

polars.dataframe.frame.DataFrame

Converting between a Series and a DataFrame column

We can create a Series from a DataFrame column with square brackets. This is useful when you want to extract a single column from a DataFrame as a Series for further processing or analysis.

In [14]:
df["Profit"].head(3)

Profit
f64
41.9136
219.582
6.8714


In [15]:
type(df["Profit"].head(3))

polars.series.series.Series

We can also create a Series from a one-column DataFrame by using the `to_series` method. This is useful when a `select` call has left you with a DataFrame but the next step expects a Series.

In [16]:
df.select("Profit").to_series().head(3)

Profit
f64
41.9136
219.582
6.8714


In [17]:
type(df.select("Profit").to_series().head(3))

polars.series.series.Series

We can convert a Series into a one-column DataFrame by using the `to_frame` method. The resulting column takes its name from the Series.

In [22]:
s = df["Customer_Name"]
s.to_frame().head(3)
# type(s.to_frame().head(3))

Customer_Name
str
"""Claire Gute"""
"""Claire Gute"""
"""Darrin Van Huff"""


Create a Series or DataFrame from a list or dict

We can create a Series or DataFrame from a list or dict. The cells below start from a plain Python list.

In [23]:
value = [1, 2, 3, 4, 5]

In [24]:
value

[1, 2, 3, 4, 5]

In [25]:
type(value)

list

In [26]:
pl.Series(value)

1
2
3
4
5


In [27]:
type(pl.Series(value))

polars.series.series.Series

If the `name` argument is not set, it defaults to an empty string. To give the Series a name, pass it as the first argument, before the values.

In [28]:
pl.Series('vals', value)

vals
i64
1
2
3
4
5


We can also convert a Series back to a Python list with the `to_list` method.

In [33]:
# pl.Series(name='vals', values=value)
pl.Series(name='vals', values=value).to_list()


[1, 2, 3, 4, 5]

In [35]:
# Checking the type of the list created from the Series
type(pl.Series(name='vals', values=value).to_list())

list