# Creating Data - Nominal Data

We will create a simple dataset of blood group frequencies. The data is organized differently from the previous example: the frequencies are split into two categories by the sex of the person. We also explicitly generate an index.

In [14]:
import pandas as pd

## Create and inspect your data

In [15]:
sex = ('M', 'F')

data = pd.DataFrame([[30, 10, 5, 40],
                     [39, 7, 2, 22]],
                    columns=('A', 'B', 'AB ', '0'),
                    index=sex)

In [16]:
print(data)

    A   B  AB    0
M  30  10    5  40
F  39   7    2  22


In [17]:
data.head(1000)

|   | A  | B  | AB | 0  |
|---|----|----|----|----|
| M | 30 | 10 | 5  | 40 |
| F | 39 | 7  | 2  | 22 |


In [18]:
data.info()

<class 'pandas.core.frame.DataFrame'>
Index: 2 entries, M to F
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   A       2 non-null      int64
 1   B       2 non-null      int64
 2   AB      2 non-null      int64
 3   0       2 non-null      int64
dtypes: int64(4)
memory usage: 80.0+ bytes


In [19]:
data.index

Index(['M', 'F'], dtype='object')

In [20]:
data.columns

Index(['A', 'B', 'AB ', '0'], dtype='object')
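
Note that the column label `'AB '` shown above carries a trailing space, so a lookup such as `data['AB']` would raise a `KeyError`. The following is a minimal sketch of one way to normalize the labels, assuming the space is unintended. The name `clean` is introduced here only for illustration and is not part of the original notebook.

```python
import pandas as pd

# Same data as above, including the trailing space in 'AB '
data = pd.DataFrame([[30, 10, 5, 40],
                     [39, 7, 2, 22]],
                    columns=('A', 'B', 'AB ', '0'),
                    index=('M', 'F'))

# Strip leading/trailing whitespace from every column label.
# `clean` is a hypothetical name; the original `data` is left unchanged.
clean = data.rename(columns=str.strip)

print(list(clean.columns))
```

Working on a renamed copy keeps the outputs of the remaining cells, which still refer to `data`, unchanged.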

## Access your data

In [21]:
# Access a column
data.A

M    30
F    39
Name: A, dtype: int64

In [22]:
# Access a row by its index label
data.loc['M']

A      30
B      10
AB      5
0      40
Name: M, dtype: int64

In [23]:
# Access a single value: column 'A' at index label 'M'
data.A.loc['M']

30
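
The chained lookup above can also be written as a single indexing step. The sketch below shows two standard pandas accessors for this, `.loc` with a row and column label and `.at` for scalar access. The variable names `value_loc` and `value_at` are illustrative only.

```python
import pandas as pd

data = pd.DataFrame([[30, 10, 5, 40],
                     [39, 7, 2, 22]],
                    columns=('A', 'B', 'AB ', '0'),
                    index=('M', 'F'))

# Row label first, column label second
value_loc = data.loc['M', 'A']

# .at is intended for reading or writing a single scalar by label
value_at = data.at['M', 'A']

print(value_loc, value_at)
```

A single `.loc[row, column]` call avoids the intermediate Series and is the form to prefer when assigning values.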

In [24]:
# Access your data as a NumPy array (the underlying values)
data.values

array([[30, 10,  5, 40],
       [39,  7,  2, 22]], dtype=int64)
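
The pandas documentation recommends `DataFrame.to_numpy()` over the `.values` attribute for obtaining a NumPy array. A minimal sketch with the same data follows. The name `arr` is illustrative, and the integer dtype printed may differ by platform.

```python
import pandas as pd

data = pd.DataFrame([[30, 10, 5, 40],
                     [39, 7, 2, 22]],
                    columns=('A', 'B', 'AB ', '0'),
                    index=('M', 'F'))

# Convert the frame to a 2-D NumPy array; row and column labels are dropped
arr = data.to_numpy()

print(arr.shape)
print(arr)
```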

In [25]:
# Access a column slice: rows 0 and 1 of column 1 ('B')
data.values[0:2, 1]

array([10,  7], dtype=int64)

In [26]:
# Access a row slice: columns 1 and 2 of row 0 ('M')
data.values[0, 1:3]

array([10,  5], dtype=int64)
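
Slicing the NumPy array discards the row and column labels. As a sketch of an alternative, the same positional slices can be taken with `.iloc`, which returns a pandas Series that keeps its labels. The names `col_slice` and `row_slice` are illustrative only.

```python
import pandas as pd

data = pd.DataFrame([[30, 10, 5, 40],
                     [39, 7, 2, 22]],
                    columns=('A', 'B', 'AB ', '0'),
                    index=('M', 'F'))

# Rows 0 and 1 of column 1: a Series indexed by 'M' and 'F'
col_slice = data.iloc[0:2, 1]

# Columns 1 and 2 of row 0: a Series indexed by 'B' and 'AB '
row_slice = data.iloc[0, 1:3]

print(col_slice)
print(row_slice)
```

As with NumPy slicing, the end position in `.iloc` is exclusive, so `1:3` selects positions 1 and 2.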