In [1]:
import pandas as pd
import numpy as np


In [2]:
series = {
    'index':[0,1,2,3],
    'data':[145,142,38,13],
    'name':'songs'
}

def get(series, idx):
    value_idx = series['index'].index(idx)
    return series['data'][value_idx]


In [3]:
get(series, 1)

142

## <font color=yellow> Example of a series: songs2 = pd.Series([145,142, 38,13], name='counts')</font>

        songs2
            index : row labels; here the default integer positions 0-3 (int64)
            value : the data stored at each index position
            name  : name of the series ('counts')
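
The three parts listed above can be read back from the Series itself. This is a minimal sketch; the variable names `index_labels`, `values`, and `series_name` are only illustrative and do not appear elsewhere in the notebook.

```python
import pandas as pd

songs2 = pd.Series([145, 142, 38, 13], name='counts')

# index: the row labels (default integer positions when none are given)
index_labels = list(songs2.index)

# values: the underlying data as a NumPy array
values = songs2.to_numpy()

# name: the label attached to the whole Series
series_name = songs2.name

print(index_labels, values, series_name)
```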


In [7]:
songs2 = pd.Series([145,142, 38,13], 
    name='counts')


In [12]:
songs2

0    145
1    142
2     38
3     13
Name: counts, dtype: int64

In [13]:
songs3 = pd.Series([5, 6])

In [15]:
songs3

0    5
1    6
dtype: int64

In [39]:
songs3 = pd.Series((145,142,38,13),
            name='counts',
            index=['Paul','John', 'George', 'Ringo'])

In [29]:
songs3.index

Index(['Paul', 'John', 'George', 'Ringo'], dtype='object')

In [27]:
songs3

Paul      145
John      142
George     38
Ringo      13
Name: counts, dtype: int64

In [30]:
type(songs3)

pandas.core.series.Series

In [None]:
songs2.index

In [31]:
class Foo():
    pass

ringo = pd.Series(['Richard', 'Starkey', 13, Foo()],
    name='ringo')

In [5]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


In [32]:
ringo

0                                        Richard
1                                        Starkey
2                                             13
3    <__main__.Foo object at 0x00000201225C1340>
Name: ringo, dtype: object

In [34]:
nan_series = pd.Series([2, np.nan],
        name='test',
        index=['ono','Clapton'])

In [20]:
nan_series

ono        2.0
Clapton    NaN
Name: test, dtype: float64

In [35]:
nan_series.count()

1

In [36]:
nan_series.size

2
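
The two cells above differ because `count()` skips missing values while `size` counts every slot. A short sketch of that relationship, using the same `nan_series`; the names `n_total`, `n_present`, and `n_missing` are illustrative only.

```python
import numpy as np
import pandas as pd

nan_series = pd.Series([2, np.nan], name='test', index=['ono', 'Clapton'])

n_total = nan_series.size            # every element, including NaN
n_present = nan_series.count()       # only non-missing elements
n_missing = nan_series.isna().sum()  # only missing elements

print(n_total, n_present, n_missing)
```

The integer 2 is shown as 2.0 because a NaN in the data makes pandas store the Series as float64.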

In [38]:
# use .iloc for positional access; songs3[1] is deprecated on a label index
songs3.iloc[1]

142

In [40]:
songs3['Paul']

145

In [45]:
songs3.median()

90.0

In [42]:
mask = songs3 > songs3.median()

In [43]:
mask

Paul       True
John       True
George    False
Ringo     False
Name: counts, dtype: bool

In [44]:
songs3[mask]

Paul    145
John    142
Name: counts, dtype: int64

In [46]:
s = pd.Series(['m', 'l','xs','s','xl'], dtype='category')

t = pd.Series(dtype='object')

In [48]:
s

0     m
1     l
2    xs
3     s
4    xl
dtype: category
Categories (5, object): ['l', 'm', 's', 'xl', 'xs']

In [47]:
t

Series([], dtype: object)

### <font color=yellow>The code below creates a pandas Series and then converts it to a categorical data type with a specific ordering. Step by step:</font>

##### Summary

1. Creating a pandas Series (s2):
    * The first line creates a pandas Series s2 from a list of size labels. A Series is a one-dimensional, labeled array that can hold any data type. Here it holds strings representing clothing sizes: 'm', 'l', 'xs', 's', and 'xl'.
1. Defining a categorical data type (size_type):
    * The second statement defines a categorical data type named size_type whose only categories are 's', 'm', and 'l'. The ordered=True parameter gives the categories a logical order that follows the order of the list: 's' < 'm' < 'l' (small < medium < large).
1. Converting the Series to the categorical type (s3):
    * The last statement converts the original Series s2 to size_type with astype, the method used for such type conversions. The resulting Series s3 holds only the defined categories ('s', 'm', 'l') and respects their order in comparisons and sorting.
    * Because s2 contains values ('xs', 'xl') that are not among the size_type categories, those values become NaN (missing values) in s3. A pandas categorical can only contain values that are listed in its categories.

**<font color=red>In summary, this code snippet demonstrates how to create a pandas Series, define a categorical data type with an inherent order, and convert a Series to this categorical type, with values outside the defined categories turning into missing values.</font>**
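
Since values outside the categories silently become NaN, it can help to list them before converting. The following is a sketch under that assumption; the name `unmapped` is hypothetical and not part of the original notebook.

```python
import pandas as pd

s2 = pd.Series(['m', 'l', 'xs', 's', 'xl'])
size_type = pd.api.types.CategoricalDtype(
    categories=['s', 'm', 'l'], ordered=True)

# values of s2 that are not among the categories of size_type
unmapped = s2[~s2.isin(size_type.categories)]

# after the conversion, exactly those positions hold NaN
s3 = s2.astype(size_type)

print(unmapped.tolist(), int(s3.isna().sum()))
```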


In [51]:
s2 = pd.Series(['m','l','xs','s','xl'])

size_type = pd.api.types.CategoricalDtype(
        categories=['s','m','l'], ordered=True)

s3 = s2.astype(size_type)

size_type

CategoricalDtype(categories=['s', 'm', 'l'], ordered=True, categories_dtype=object)

In [52]:
s3

0      m
1      l
2    NaN
3      s
4    NaN
dtype: category
Categories (3, object): ['s' < 'm' < 'l']

#### **<font color=yellow>Comparing s3 with 's' <font color=red>returns a boolean Series that is True where the value ranks above 's' in the category order</font>; missing values (NaN) compare as False</font>**

In [53]:
s3 > 's'

0     True
1     True
2    False
3    False
4    False
dtype: bool
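
The boolean result can be used as a mask, in the same way as the median mask on songs3 earlier. A small sketch, rebuilding s3 so the block runs on its own; `larger_than_s` and `in_order` are illustrative names.

```python
import pandas as pd

size_type = pd.api.types.CategoricalDtype(
    categories=['s', 'm', 'l'], ordered=True)
s3 = pd.Series(['m', 'l', 'xs', 's', 'xl']).astype(size_type)

# keep only the rows whose size ranks above 's'
larger_than_s = s3[s3 > 's']

# sorting follows the category order, not alphabetical order
in_order = s3.sort_values().dropna()

print(larger_than_s.tolist(), in_order.tolist())
```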

In [None]:
s3

In [None]:
s = s.cat.reorder_categories(['xs','s','m','l','xl'],
                          ordered=True)

In [None]:
s

In [None]:
# series of temperatures
t = pd.Series([4,89,43, 55, 44])

In [None]:
t[t > t.mean()]

In [None]:
c = pd.Series(['red', 'blue', 'green'],
               dtype='category' )

In [None]:
c