# 10 minutes to pandas

This is a short introduction to pandas, geared mainly toward new users.

You can see more complex recipes in the Cookbook.

Customarily, we import as follows:

In [1]:
import numpy as np

import pandas as pd

# Object creation

See the Data Structure Intro section.

Creating a Series by passing a list of values, letting pandas create a default integer index:

In [2]:
s = pd.Series([1, 3, 5, np.nan, 6, 8])

s

0    1.0
1    3.0
2    5.0
3    NaN
4    6.0
5    8.0
dtype: float64
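The `float64` dtype above comes from the `np.nan` entry, which is a float. The following is a minimal sketch of that behavior, together with an explicit index in place of the default one; the variable names `ints`, `with_nan`, and `labeled` are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

# Without a missing value, integer input keeps an integer dtype.
ints = pd.Series([1, 3, 5])

# np.nan is a float, so the whole Series is upcast to float64.
with_nan = pd.Series([1, 3, 5, np.nan, 6, 8])

# An explicit index replaces the default 0..n-1 integer index.
labeled = pd.Series([1, 3, 5], index=["a", "b", "c"])

print(ints.dtype, with_nan.dtype)
print(labeled["b"])  # label-based lookup
```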

Creating a DataFrame by passing a NumPy array, with a datetime index and labeled columns:

In [3]:
dates = pd.date_range("20130101", periods=6)

dates

DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04',
               '2013-01-05', '2013-01-06'],
              dtype='datetime64[ns]', freq='D')

In [4]:
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list("ABCD"))

df

                   A         B         C         D
2013-01-01  1.098577 -0.037728 -1.228922  0.568379
2013-01-02 -0.261217 -0.790637  0.657532  0.122455
2013-01-03 -0.423491 -0.212405 -0.074347  0.678173
2013-01-04  0.039766  1.103698  0.204133 -0.491073
2013-01-05  1.759463  0.762079  2.385955  1.826336
2013-01-06 -0.510878  2.126516 -1.218763 -0.548512
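Because `np.random.randn` is unseeded here, the values shown will differ on every run. If repeatable numbers are wanted, one option is a seeded NumPy `Generator`. This is only a sketch: the seed value `0` and the name `df_seeded` are arbitrary choices, not part of the original example.

```python
import numpy as np
import pandas as pd

# A seeded generator yields the same draws on every run (seed chosen arbitrarily).
rng = np.random.default_rng(seed=0)

dates = pd.date_range("20130101", periods=6)

# Same shape, index, and column labels as the DataFrame built above.
df_seeded = pd.DataFrame(
    rng.standard_normal((6, 4)), index=dates, columns=list("ABCD")
)

print(df_seeded.shape)
print(df_seeded.index[0], df_seeded.index[-1])
```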


Creating a DataFrame by passing a dict of objects that can be converted to a series-like structure:

In [5]:
df2 = pd.DataFrame(
    {
        "A": 1.0,
        "B": pd.Timestamp("20130102"),
        "C": pd.Series(1, index=list(range(4)), dtype="float32"),
        "D": np.array([3] * 4, dtype="int32"),
        "E": pd.Categorical(["test", "train", "test", "train"]),
        "F": "foo",
    }
)


df2

     A          B    C  D      E    F
0  1.0 2013-01-02  1.0  3   test  foo
1  1.0 2013-01-02  1.0  3  train  foo
2  1.0 2013-01-02  1.0  3   test  foo
3  1.0 2013-01-02  1.0  3  train  foo
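In the example above, the scalar values (`1.0`, the `Timestamp`, and `"foo"`) are repeated for every row, and the row count comes from the length-4 objects. A small sketch of the same idea with scalars only, where the length has to be supplied through `index`; the names `small` and `error_message` are illustrative.

```python
import pandas as pd

# Scalars carry no length of their own, so an explicit index sets the row count.
small = pd.DataFrame({"A": 1.0, "F": "foo"}, index=range(3))
print(small)

# With scalars only and no index, pandas cannot infer how many rows to build.
error_message = None
try:
    pd.DataFrame({"A": 1.0, "F": "foo"})
except ValueError as exc:
    error_message = str(exc)
    print("ValueError:", error_message)
```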


In [6]:
# The columns of the resulting DataFrame have different dtypes.
df2.dtypes

A           float64
B    datetime64[ns]
C           float32
D             int32
E          category
F            object
dtype: object
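Per-column dtypes can also be used to filter or convert columns. The sketch below rebuilds the same `df2` and uses `select_dtypes` and `astype`; the names `numeric` and `widened` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame(
    {
        "A": 1.0,
        "B": pd.Timestamp("20130102"),
        "C": pd.Series(1, index=list(range(4)), dtype="float32"),
        "D": np.array([3] * 4, dtype="int32"),
        "E": pd.Categorical(["test", "train", "test", "train"]),
        "F": "foo",
    }
)

# Keep only the numeric columns (float64, float32, and int32 here).
numeric = df2.select_dtypes(include="number")
print(list(numeric.columns))

# Convert one column to a wider integer type; astype returns a new DataFrame.
widened = df2.astype({"D": "int64"})
print(widened.dtypes["D"])
```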

In [7]:
# If you’re using IPython, tab completion for column names (as well as public attributes) is automatically enabled.
# Typing the line below and pressing TAB lists the available completions. It is a keystroke, not valid Python, so it is left commented out:
# df2.<TAB>
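Outside an interactive shell, a rough equivalent of tab completion is to inspect `dir()` on the DataFrame. The sketch below uses a small hypothetical DataFrame named `frame` rather than `df2`, to show that only column labels that are valid Python identifiers are offered as attributes.

```python
import pandas as pd

# Hypothetical frame: one identifier-like column label and one containing a space.
frame = pd.DataFrame({"A": [1.0, 2.0], "my col": [3, 4]})

# Public names that attribute access (and tab completion) can offer.
public_names = [name for name in dir(frame) if not name.startswith("_")]

print("A" in public_names)       # column label that is a valid identifier
print("my col" in public_names)  # not an identifier, reachable via frame["my col"]
print("dtypes" in public_names)  # ordinary public attribute
```

Columns such as `"my col"` remain available through bracket indexing, which works for any label.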