In [5]:
import pandas as pd 

## Pandas Series Object


In [9]:
songs2 = pd.Series([145,142,38,13], name = "counts")
songs2

0    145
1    142
2     38
3     13
Name: counts, dtype: int64

```
The generic name for an index is an axis, and the values of the index (0, 1, 2, 3) are called axis labels.
The data (145, 142, 38, and 13) are also called the values of the series.

The two-dimensional structure in pandas, a DataFrame, has two axes: one for the rows and another for the columns.

The index can be string-based as well, in which case pandas indicates that the datatype for the
index is object (not string):
```
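As a minimal sketch of the terms above, the following inspects the axis labels and values of a series and the two axes of a DataFrame. The `rank` column is a made-up addition used only to get a second column; it is not part of the original data.

```python
import pandas as pd

# A series has one axis: its index. The axis labels here are 0, 1, 2, 3.
songs = pd.Series([145, 142, 38, 13], name="counts")
labels = list(songs.index)   # axis labels
values = songs.to_numpy()    # the values of the series

# A DataFrame has two axes: one for the rows, one for the columns.
# The "rank" column is hypothetical and exists only for illustration.
frame = pd.DataFrame({"counts": songs, "rank": [1, 2, 3, 4]})
row_axis, column_axis = frame.axes

print(labels)
print(values)
print(list(row_axis), list(column_axis))
```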

### Index can be string-based

**dtype : object**

In [14]:
songs3 = pd.Series([145, 142, 38, 13],
                   name=' counts ',
                   index=[' Paul ', ' John ', ' George ', ' Ringo '])

print(songs3,"\n")
print(songs3.index)

 Paul       145
 John       142
 George      38
 Ringo       13
Name:  counts , dtype: int64 

Index([' Paul ', ' John ', ' George ', ' Ringo '], dtype='object')


### Datatypes
```
The actual data (or values) for a series does not have to be numeric or homogeneous. We can insert Python
objects into a series.

The object data type is also used for a series with string values. In addition, it is also used
for values that have heterogeneous or mixed types. If you have just numeric data in a series, you
wouldn’t want it stored as a Python object, but rather as an int64 or float64, which allow you to do
vectorized numeric operations.

If you have time data and it says it has the object type, you probably have strings for the dates.
Using strings instead of date types is bad as you don’t get the date operations that you would get
if the type were datetime64[ns]. A series with string data, on the other hand, has the type of object.


NaN: This value stands for Not a Number. Aggregations such as .sum and .mean skip it by default
(similar to NULL in SQL), while element-wise arithmetic with NaN yields NaN. Also, float64 supports NaN, which int64 does not. When pandas sees numeric data together with np.nan, it coerces the integers to float values.

If you load data from a CSV file, an empty value for an otherwise numeric column will become
NaN. Methods such as .fillna and .dropna, covered later, deal with NaN.

None, NaN, nan, <NA>, and null are synonyms in this book when referring to empty or missing data
found in a pandas series or dataframe.

The int64 type does not support missing data. Many consider that a wart of pandas. As of
pandas 0.24, there is optional support for another integer type that can hold missing values, denoted
as <NA>. The documentation calls this type the nullable integer type. When you create a series,
you can pass in dtype='Int64' (note the capitalization).

You can use the .astype method to convert columns to the nullable integer type. Just use the
string 'Int64' as the type:
>>> nan_series.astype('Int64')
```    
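The dtype rules described above can be checked with a few small series. This is a sketch with made-up values; the variable names (`mixed`, `with_nan`, `nullable`, `converted`) are illustrative and do not come from the notes.

```python
import numpy as np
import pandas as pd

mixed = pd.Series(["Paul", 145, 3.5])                       # mixed types -> object
ints = pd.Series([145, 142, 38, 13])                        # only integers -> int64
with_nan = pd.Series([145, 142, np.nan, 13])                # np.nan coerces to float64
nullable = pd.Series([145, 142, None, 13], dtype="Int64")   # nullable integer type

print(mixed.dtype, ints.dtype, with_nan.dtype, nullable.dtype)

# Aggregations skip the missing value by default
total = with_nan.sum()
print(total)

# Convert an existing float series to the nullable integer type.
# This works here because every non-missing value is a whole number.
converted = with_nan.astype("Int64")
print(converted)
```

Note the capital `I`: the lowercase `'int64'` would fail on `with_nan` because that dtype cannot hold a missing value.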

### Series Object
```
The Series object behaves similarly to a NumPy array, and the two share many methods.

Both also have a notion of a boolean array. In pandas, a boolean array is a series of boolean values
with the same index as the series you are working with, and it can be used as a mask to filter
out items.

Once we have a mask, we can use it as a filter.

```
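To make the comparison with NumPy concrete, here is a small side-by-side sketch. It uses the same counts as the series above with whitespace-free labels; the names `arr` and `ser` are illustrative.

```python
import numpy as np
import pandas as pd

counts = [145, 142, 38, 13]
arr = np.array(counts)
ser = pd.Series(counts, index=["Paul", "John", "George", "Ringo"])

# Shared methods: both objects expose .mean(), .max(), and so on
print(arr.mean(), ser.mean())

# A NumPy mask is purely positional; a pandas mask also carries the index
arr_mask = arr > np.median(arr)
ser_mask = ser > ser.median()

print(arr[arr_mask])
print(ser[ser_mask])
```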

In [22]:
mask = songs3 > songs3.median()  # boolean array

print(mask, "\n")
print(songs3[mask])

 Paul        True
 John        True
 George     False
 Ringo      False
Name:  counts , dtype: bool 

 Paul     145
 John     142
Name:  counts , dtype: int64


### Categorical Data

```
Categories are not limited to strings; we can also convert numbers or datetime values to categorical
data.

To create a category, we pass dtype="category" into the Series constructor. Alternatively, we can
call the .astype("category") method on a series.

By default, categories don’t have an ordering. We can verify this by inspecting the .cat accessor,
which exposes properties such as .ordered:
```
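The `.astype("category")` route and the point about non-string categories can be sketched as follows. The `doors` values are made up for illustration.

```python
import pandas as pd

# Convert an existing string series instead of passing dtype= to the constructor
sizes = pd.Series(["m", "l", "xs", "s", "xl"])
as_cat = sizes.astype("category")

# Numbers can be categorical too (hypothetical values)
doors = pd.Series([2, 4, 4, 2, 5]).astype("category")

print(as_cat.dtype)
print(as_cat.cat.ordered)            # categories are unordered by default
print(list(doors.cat.categories))    # the distinct values, sorted
```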

In [42]:
s = pd.Series(['m', 'l', 'xs', 's', 'xl'], dtype='category')
s

0     m
1     l
2    xs
3     s
4    xl
dtype: category
Categories (5, object): ['l', 'm', 's', 'xl', 'xs']

In [43]:
s.cat.ordered

False

### Convert to ordered category
```
To convert a non-categorical series to an ordered category, we can create a type with the CategoricalDtype constructor and the appropriate parameters. Then we pass this type into the .astype method. Values that are not among the listed categories become NaN.

If we have ordered categories, we can do comparisons on them.

The CategoricalDtype approach creates a new Series from existing data that was not categorical. We can also
add ordering information to data that is already categorical with .cat.reorder_categories. We just need to make sure that we specify all of the
members of the category or pandas will raise a ValueError:
    
```
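The ValueError mentioned above can be reproduced with a short sketch. It assumes a categorical series with the same five sizes and deliberately leaves two members out of the new ordering.

```python
import pandas as pd

s = pd.Series(["m", "l", "xs", "s", "xl"], dtype="category")

try:
    # 'xs' and 'xl' are missing, so the list does not match the existing categories
    s.cat.reorder_categories(["s", "m", "l"], ordered=True)
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)

# Listing every member works, and the result supports comparisons
ordered = s.cat.reorder_categories(["xs", "s", "m", "l", "xl"], ordered=True)
bigger_than_s = (ordered > "s").tolist()
print(bigger_than_s)
```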

In [44]:
s2 = pd.Series(['m', 'l', 'xs', 's', 'xl'])
size_type = pd.api.types.CategoricalDtype(categories=['s', 'm', 'l'], ordered=True)
s3 = s2.astype(size_type)  # 'xs' and 'xl' are not in the categories, so they become NaN

In [45]:
s3 > 's'

0     True
1     True
2    False
3    False
4    False
dtype: bool

In [46]:
s.cat.reorder_categories(['xs', 's', 'm', 'l', 'xl'], ordered=True)

0     m
1     l
2    xs
3     s
4    xl
dtype: category
Categories (5, object): ['xs' < 's' < 'm' < 'l' < 'xl']

```
String and datetime series have a str and dt attribute that allow us to perform common
operations specific to that type. If we convert these types to categorical types, we can still
use the str or dt attributes on them:
```

In [48]:
s3.str.upper()

0      M
1      L
2    NaN
3      S
4    NaN
dtype: object
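
The cell above exercises the str attribute. As a sketch of the datetime counterpart, the dt attribute can be used the same way on a categorical series whose categories are datetimes. The dates below are made up for illustration.

```python
import pandas as pd

# Hypothetical dates, used only to demonstrate the accessor
dates = pd.Series(pd.to_datetime(["2021-01-15", "2021-06-30", "2021-01-15"]))
date_cat = dates.astype("category")

print(date_cat.dtype)
months = date_cat.dt.month.tolist()   # .dt still works on the categorical series
print(months)
```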

In [54]:
path = "vehicles.csv"
df = pd.read_csv(path)

df.city08


0        19
1         9
2        23
3        10
4        17
         ..
41139    19
41140    20
41141    18
41142    18
41143    16
Name: city08, Length: 41144, dtype: int64

In [55]:
highway_mpg = df.highway08
highway_mpg

0        25
1        14
2        33
3        12
4        23
         ..
41139    26
41140    28
41141    24
41142    24
41143    21
Name: highway08, Length: 41144, dtype: int64

```
Each series has 41,144 integer entries. Because the dtype of both series is int64,
we know that none of the values are missing.
```
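The reasoning about int64 can be double-checked directly. The sketch below uses a small stand-in series built from the first five city08 values shown in the output above, so it runs without vehicles.csv; on the real data the same `.isna().sum()` call applies to `df.city08`.

```python
import pandas as pd

# Stand-in for df.city08: the first five values from the output above
city_mpg = pd.Series([19, 9, 23, 10, 17], name="city08")

print(city_mpg.dtype)
missing = int(city_mpg.isna().sum())   # number of missing entries
print(missing)

# If one entry were missing, pandas could not keep the int64 dtype
with_gap = pd.Series([19, 9, None, 10, 17], name="city08")
print(with_gap.dtype)
```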