Source: https://pandas.pydata.org/docs/user_guide/10min.html

Source: https://github.com/hugoestradas/Pandas_101/blob/master/Pandas%20101%20-%20pt.i_%20Intro.%20Data%20Structures.ipynb

Pandas - Python Data Analysis Library

In [21]:
import pandas as pd
from pandas import Series
import numpy as np

## Object creation

In [11]:
series = pd.Series(['a', 'b', 'c'])
series

0    a
1    b
2    c
dtype: object

In [12]:
# A Series can be given a custom index
series2 = pd.Series(['a', 'b', 'c'], index=[6,7,8])
print(series2)

6    a
7    b
8    c
dtype: object


In [13]:
series2[7]

'b'
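Because `series2` carries integer labels, `series2[7]` is a lookup by label, not by position. A minimal sketch of making the intent explicit with `.loc` (label) and `.iloc` (position); the names `s`, `by_label`, and `by_position` are only illustrative:

```python
import pandas as pd

s = pd.Series(['a', 'b', 'c'], index=[6, 7, 8])

by_label = s.loc[7]      # element whose index label is 7
by_position = s.iloc[1]  # second element, regardless of its label

print(by_label, by_position)
```

Both lookups reach the same element here, but only `.loc[7]` depends on the index labels.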

In [14]:
# Boolean indexing: keep only the elements that satisfy a condition
series3 = pd.Series([10,-20,30,40,-50], index=['a','b','c','d','e'])
series3[series3 > 0]

a    10
c    30
d    40
dtype: int64

In [15]:
series3[series3 < 0]

b   -20
e   -50
dtype: int64
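The two filters above rely on a boolean mask: the comparison produces a Series of `True`/`False` values, and indexing with it keeps the `True` rows. A small sketch that shows the intermediate mask and how conditions might be combined; the names `s`, `mask`, `positives`, and `in_range` are illustrative:

```python
import pandas as pd

s = pd.Series([10, -20, 30, 40, -50], index=['a', 'b', 'c', 'd', 'e'])

mask = s > 0                      # boolean Series aligned with s
positives = s[mask]               # keeps only the rows where mask is True
in_range = s[(s > 0) & (s < 40)]  # combine conditions with & and parentheses

print(mask)
print(in_range)
```

The parentheses around each condition are needed because `&` binds more tightly than the comparison operators.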

In [16]:
series4 = {'iPhone SE': 400, 'iPhone 11': 700, 'iPhone 11 Pro': 1000}
series4 = pd.Series(series4)
series4

iPhone SE         400
iPhone 11         700
iPhone 11 Pro    1000
dtype: int64

In [17]:
series4['iPhone SE']

400

In [18]:
series4['iPhone SE':'iPhone 11']

iPhone SE    400
iPhone 11    700
dtype: int64
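Note that the label slice above includes its end label, unlike ordinary Python slicing. A brief sketch contrasting label-based and position-based slices; the name `prices` is illustrative and stands in for `series4`:

```python
import pandas as pd

prices = pd.Series({'iPhone SE': 400, 'iPhone 11': 700, 'iPhone 11 Pro': 1000})

by_label = prices.loc['iPhone SE':'iPhone 11']  # end label is included
by_position = prices.iloc[0:1]                  # end position is excluded

print(len(by_label), len(by_position))
```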

In [19]:
dates = pd.date_range("20130101", periods=6)
dates

DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04',
               '2013-01-05', '2013-01-06'],
              dtype='datetime64[ns]', freq='D')

In [22]:
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list("ABCD"))
df

                   A         B         C         D
2013-01-01 -1.242534 -0.213890  1.558853 -0.816600
2013-01-02 -0.314129  0.402011 -0.393254 -0.149654
2013-01-03 -0.741575 -1.008213 -0.032117  1.561699
2013-01-04 -0.646989  0.370501  1.130763 -1.521343
2013-01-05 -1.852969 -0.183230  0.556890  1.393243
2013-01-06 -0.801731  0.412041  0.796570 -0.014598
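`np.random.randn` draws fresh values on every run, so the numbers in the output above will differ from what you see locally. A sketch, assuming reproducible values are wanted, using NumPy's `default_rng`; the seed `0` and the names `rng`, `df_seeded`, and `again` are illustrative, and the resulting values will not match the output above:

```python
import numpy as np
import pandas as pd

dates = pd.date_range("20130101", periods=6)

rng = np.random.default_rng(0)  # arbitrary seed, chosen only for illustration
df_seeded = pd.DataFrame(
    rng.standard_normal((6, 4)), index=dates, columns=list("ABCD")
)

# A generator re-created with the same seed reproduces the same values.
again = np.random.default_rng(0).standard_normal((6, 4))
print(np.array_equal(df_seeded.to_numpy(), again))
```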


In [23]:
df.dtypes

A    float64
B    float64
C    float64
D    float64
dtype: object

In [30]:
df2 = pd.DataFrame(
    {
        "A": 1.0,
        "B": pd.Timestamp("20130102"),
        "C": pd.Series(1, index=list(range(4)), dtype="float32"),
        "D": np.array([3] * 4, dtype="int32"),
        "E": pd.Categorical(["test", "train", "test", "train"]),
        "F": "foo",
    }
)
df2

     A          B    C  D      E    F
0  1.0 2013-01-02  1.0  3   test  foo
1  1.0 2013-01-02  1.0  3  train  foo
2  1.0 2013-01-02  1.0  3   test  foo
3  1.0 2013-01-02  1.0  3  train  foo


In [31]:
df2.dtypes

A           float64
B    datetime64[ns]
C           float32
D             int32
E          category
F            object
dtype: object

## Viewing data

In [24]:
df.head()

                   A         B         C         D
2013-01-01 -1.242534 -0.213890  1.558853 -0.816600
2013-01-02 -0.314129  0.402011 -0.393254 -0.149654
2013-01-03 -0.741575 -1.008213 -0.032117  1.561699
2013-01-04 -0.646989  0.370501  1.130763 -1.521343
2013-01-05 -1.852969 -0.183230  0.556890  1.393243


In [25]:
df.tail(2)

                   A         B         C         D
2013-01-05 -1.852969 -0.183230  0.556890  1.393243
2013-01-06 -0.801731  0.412041  0.796570 -0.014598


In [26]:
df.index

DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04',
               '2013-01-05', '2013-01-06'],
              dtype='datetime64[ns]', freq='D')

In [27]:
df.columns

Index(['A', 'B', 'C', 'D'], dtype='object')

`DataFrame.to_numpy()` returns a NumPy representation of the underlying data. This can be an expensive operation when the `DataFrame` has columns with different data types, which comes down to a fundamental difference between pandas and NumPy: NumPy arrays have one dtype for the entire array, while pandas DataFrames have one dtype per column. When the column dtypes differ, pandas has to pick a single NumPy dtype that can hold every column, which may end up being `object`.

In [28]:
df.to_numpy()

array([[-1.24253447, -0.21389045,  1.55885268, -0.81659978],
       [-0.31412867,  0.40201099, -0.39325368, -0.14965362],
       [-0.74157456, -1.00821301, -0.03211745,  1.56169858],
       [-0.64698886,  0.37050096,  1.13076283, -1.52134298],
       [-1.85296879, -0.18323019,  0.55688959,  1.39324308],
       [-0.80173144,  0.41204075,  0.79656968, -0.01459802]])
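Here `df` contains only `float64` columns, so the result is a plain float array. A minimal sketch of what happens when the dtypes differ, using two small hypothetical frames (`floats` and `mixed` are illustrative and not part of the notebook):

```python
import pandas as pd

floats = pd.DataFrame({"x": [1.0, 2.0], "y": [3.0, 4.0]})
mixed = pd.DataFrame({"x": [1.0, 2.0], "label": ["a", "b"]})

# All columns share one dtype, so the array keeps it.
print(floats.to_numpy().dtype)

# Floats and strings have no common numeric dtype, so the array
# falls back to a dtype that can hold both kinds of value.
print(mixed.to_numpy().dtype)
```

The second conversion yields an array of generic Python objects, which is why converting frames with mixed column types tends to cost more.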