In [1]:
import numpy as np
import pandas as pd
import datetime

## DataSeries

In [2]:
my_readings = [3.12, 3.54, 3.24, 3.67, 3.56, 3.87]

We can create a pandas `Series` (called a DataSeries in these notes) from any list or `np.array`:

In [3]:
ds_readings = pd.Series(my_readings)
ds_readings

Unnamed: 0,0
0,3.12
1,3.54
2,3.24
3,3.67
4,3.56
5,3.87
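
The cell above builds the series from a list. As a minimal sketch of the `np.array` route, the snippet below uses a shorter, made-up array; the names `readings_array` and `ds_from_array` are introduced only for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical readings, used only for this illustration
readings_array = np.array([3.12, 3.54, 3.24])

# Build a Series from a NumPy array; `name` is optional
ds_from_array = pd.Series(readings_array, name="reading")

print(ds_from_array.dtype)        # dtype of the stored values
print(list(ds_from_array.index))  # default integer labels
```

The default index is the same kind of `RangeIndex` as in the list-based example.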


In [4]:
## Output Index
ds_readings.index

RangeIndex(start=0, stop=6, step=1)

In [5]:
## Output Value
ds_readings.values

array([3.12, 3.54, 3.24, 3.67, 3.56, 3.87])

In [6]:
ds_readings[0]

3.12

In [7]:
ds_readings[4]

3.56

In [8]:
ds_readings[1:4]

Unnamed: 0,0
1,3.54
2,3.24
3,3.67
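
The lookups above use index labels, which happen to coincide with positions for the default index. When you want positions regardless of the labels (for example, counting from the end), `.iloc` is one option. A small sketch on a copy of the data, with `ds_demo` as an illustrative name:

```python
import pandas as pd

ds_demo = pd.Series([3.12, 3.54, 3.24, 3.67, 3.56, 3.87])

first = ds_demo.iloc[0]      # element at position 0
last = ds_demo.iloc[-1]      # negative positions count from the end
middle = ds_demo.iloc[1:4]   # positions 1, 2, 3 (end position excluded)

print(first, last)
print(middle)
```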


**Masking**

In [9]:
## Note: the mask is a plain list of booleans, one per element
ds_readings[[True, False, True, False, False, False]]

Unnamed: 0,0
0,3.12
2,3.24


In [10]:
ds_readings + 10

Unnamed: 0,0
0,13.12
1,13.54
2,13.24
3,13.67
4,13.56
5,13.87


In [11]:
ds_readings > 3.5

Unnamed: 0,0
0,False
1,True
2,False
3,True
4,True
5,True


### Q: Why might the above be useful?

The resulting Boolean series can be used as a mask to select elements from this series, or from any other series that shares the same index.

In [12]:
ds_readings[ds_readings > 3.5]

Unnamed: 0,0
1,3.54
3,3.67
4,3.56
5,3.87
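
Masks can also be combined. The sketch below, on a copy of the original readings named `ds_demo` for illustration, uses `&` (and) and `~` (not); each comparison needs its own parentheses because `&` binds more tightly than `>` and `<`.

```python
import pandas as pd

ds_demo = pd.Series([3.12, 3.54, 3.24, 3.67, 3.56, 3.87])

# True where the reading lies strictly between 3.5 and 3.8
in_band = (ds_demo > 3.5) & (ds_demo < 3.8)

inside = ds_demo[in_band]     # readings inside the band
outside = ds_demo[~in_band]   # everything else

print(inside)
print(outside)
```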


We can combine filtering with arithmetic operators:

In [13]:
ds_readings[ds_readings > 3.5] * 2

Unnamed: 0,0
1,7.08
3,7.34
4,7.12
5,7.74


Changing values, whether all of them or only a masked subset, is just as easy:

In [14]:
ds_readings[ds_readings > 3.5] = 0

In [15]:
ds_readings

Unnamed: 0,0
0,3.12
1,0.0
2,3.24
3,0.0
4,0.0
5,0.0
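
The assignment above modifies `ds_readings` in place. If you would rather keep the original values, one alternative is `Series.mask`, which returns a new series. A sketch on a separate copy of the data (`ds_demo` and `zeroed` are illustrative names):

```python
import pandas as pd

ds_demo = pd.Series([3.12, 3.54, 3.24, 3.67, 3.56, 3.87])

# Replace values where the condition is True; ds_demo itself is untouched
zeroed = ds_demo.mask(ds_demo > 3.5, 0)

print(zeroed)
print(ds_demo)
```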


## Exercise
   * ### Create a new DataSeries on a topic of your choosing (numeric, length = 8)
   * ### Output the 2nd, last, and last two elements
   * ### Subtract a number from all elements
   * ### Generate and apply a mask
   * ### Use the mask to set values to 0

### How do I modify the index?

In [16]:
timings = [datetime.time(1, 3, increment) for increment in range(6)]

In [17]:
timings

[datetime.time(1, 3),
 datetime.time(1, 3, 1),
 datetime.time(1, 3, 2),
 datetime.time(1, 3, 3),
 datetime.time(1, 3, 4),
 datetime.time(1, 3, 5)]

In [18]:
ds_readings.index = timings

In [19]:
ds_readings

Unnamed: 0,0
01:03:00,3.12
01:03:01,0.0
01:03:02,3.24
01:03:03,0.0
01:03:04,0.0
01:03:05,0.0
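
Instead of assigning to `.index` afterwards, the labels can also be supplied when the series is created. A minimal sketch with three made-up readings; `times` and `ds_timed` are names introduced here for illustration.

```python
import datetime
import pandas as pd

times = [datetime.time(1, 3, second) for second in range(3)]

# Pass the labels at construction time via the `index` argument
ds_timed = pd.Series([3.12, 3.54, 3.24], index=times)

# Look a value up by its label
value = ds_timed.loc[datetime.time(1, 3, 1)]
print(value)
```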


The `.value_counts()` method finds the unique values in the series and returns the number of occurrences of each:

In [20]:
ds_readings.value_counts()

Unnamed: 0,count
0.0,4
3.12,1
3.24,1
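
If proportions are more useful than raw counts, `value_counts` accepts `normalize=True`. A short sketch on made-up data (`ds_demo` is an illustrative name):

```python
import pandas as pd

ds_demo = pd.Series([0.0, 0.0, 0.0, 3.12])

# Fraction of the series taken up by each unique value
shares = ds_demo.value_counts(normalize=True)
print(shares)
```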


We can also sort the values, in ascending or descending order:

In [21]:
ds_readings.sort_values()

Unnamed: 0,0
01:03:01,0.0
01:03:03,0.0
01:03:04,0.0
01:03:05,0.0
01:03:00,3.12
01:03:02,3.24


In [22]:
ds_readings.sort_values(ascending = False)

Unnamed: 0,0
01:03:02,3.24
01:03:00,3.12
01:03:01,0.0
01:03:03,0.0
01:03:04,0.0
01:03:05,0.0
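
Sorting by value shuffles the index labels, as the outputs above show. To get back to label order, `sort_index` can be used. A sketch with a small, deliberately unordered series (`ds_demo` is an illustrative name):

```python
import datetime
import pandas as pd

# Labels given out of order on purpose
times = [datetime.time(1, 3, second) for second in (2, 0, 1)]
ds_demo = pd.Series([3.24, 3.12, 0.0], index=times)

by_time = ds_demo.sort_index()   # order by label instead of by value
print(by_time)
```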


### np.nan

`np.nan` represents a missing value: one that should exist but does not. pandas provides an easy way to check for missing values.

We first add `np.nan` to the series:

In [23]:
ds_readings[datetime.time(0,3,6)] =  np.nan

In [24]:
ds_readings

Unnamed: 0,0
01:03:00,3.12
01:03:01,0.0
01:03:02,3.24
01:03:03,0.0
01:03:04,0.0
01:03:05,0.0
00:03:06,


In [25]:
ds_readings[datetime.time(0,3,7)] =  3.48

In [26]:
ds_readings

Unnamed: 0,0
01:03:00,3.12
01:03:01,0.0
01:03:02,3.24
01:03:03,0.0
01:03:04,0.0
01:03:05,0.0
00:03:06,
00:03:07,3.48


pandas provides the `isna` method for checking for missing values:

In [27]:
ds_readings.isna()

Unnamed: 0,0
01:03:00,False
01:03:01,False
01:03:02,False
01:03:03,False
01:03:04,False
01:03:05,False
00:03:06,True
00:03:07,False


In [28]:
ds_readings[ds_readings.isna()]

Unnamed: 0,0
00:03:06,


In [29]:
result = ds_readings[ds_readings.isna()]

In [30]:
result.index

Index([00:03:06], dtype='object')

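To remove items, use `drop` and pass the index labels to remove. Note that `drop` returns a new series and leaves `ds_readings` unchanged, which is why the missing value still shows up in the `isna()` check that follows.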

In [31]:
ds_readings.drop(result.index)

Unnamed: 0,0
01:03:00,3.12
01:03:01,0.0
01:03:02,3.24
01:03:03,0.0
01:03:04,0.0
01:03:05,0.0
00:03:07,3.48
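
For the specific case of removing missing values, `dropna` is a shortcut for the mask-then-drop steps above. A sketch on a small made-up series with string labels (`ds_demo` is an illustrative name); both approaches return new series.

```python
import numpy as np
import pandas as pd

ds_demo = pd.Series([3.12, np.nan, 3.48], index=["a", "b", "c"])

# Two-step version: find the labels of the missing entries, then drop them
dropped = ds_demo.drop(ds_demo[ds_demo.isna()].index)

# One-step version
cleaned = ds_demo.dropna()

print(cleaned)
print(len(ds_demo))   # the original still has all its entries
```

To keep the result, assign it to a name (for example back to the original variable).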


In [32]:
ds_readings.isna()

Unnamed: 0,0
01:03:00,False
01:03:01,False
01:03:02,False
01:03:03,False
01:03:04,False
01:03:05,False
00:03:06,True
00:03:07,False


There are also reduction methods, which collapse a series into a single value:

In [33]:
ds_readings.isna().any()

True

In [34]:
ds_readings.isna().all()

False

In [35]:
ds_readings.isna().sum()

1
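
Reductions on the numeric values themselves skip missing entries by default. The sketch below uses made-up numbers (`ds_demo` is an illustrative name) to show the default and the `skipna=False` variant.

```python
import numpy as np
import pandas as pd

ds_demo = pd.Series([1.0, np.nan, 3.0])

total = ds_demo.sum()                # NaN is skipped by default
mean = ds_demo.mean()                # average of the non-missing values
n_valid = ds_demo.count()            # number of non-missing values
strict = ds_demo.sum(skipna=False)   # NaN propagates when skipna=False

print(total, mean, n_valid, strict)
```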

In [36]:
ds_readings.unique()

array([3.12, 0.  , 3.24,  nan, 3.48])

### Mappings

In [37]:
mapping = {0: 10.0, np.nan: 0.0}
ds_readings.replace(mapping)

Unnamed: 0,0
01:03:00,3.12
01:03:01,10.0
01:03:02,3.24
01:03:03,10.0
01:03:04,10.0
01:03:05,10.0
00:03:06,0.0
00:03:07,3.48
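
When the only goal is to fill in missing values, `fillna` is a more direct alternative to a `replace` mapping. A sketch on made-up data (`ds_demo` and `filled` are illustrative names); like `replace`, it returns a new series.

```python
import numpy as np
import pandas as pd

ds_demo = pd.Series([3.12, 0.0, np.nan])

# Replace only the missing entries; existing zeros are left alone
filled = ds_demo.fillna(0.0)

print(filled)
```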


In [38]:
def myround(x):
    return round(x, 1)

In [39]:
ds_readings.map(myround)

Unnamed: 0,0
01:03:00,3.1
01:03:01,0.0
01:03:02,3.2
01:03:03,0.0
01:03:04,0.0
01:03:05,0.0
00:03:06,
00:03:07,3.5


In [40]:
ds_readings.map(lambda x: round(x,1))

Unnamed: 0,0
01:03:00,3.1
01:03:01,0.0
01:03:02,3.2
01:03:03,0.0
01:03:04,0.0
01:03:05,0.0
00:03:06,
00:03:07,3.5
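
`map` calls the Python function once per element. For common operations such as rounding, pandas also offers built-in methods that work on the whole series at once. A sketch comparing the two on made-up values (`ds_demo` is an illustrative name):

```python
import numpy as np
import pandas as pd

ds_demo = pd.Series([3.12, 3.24, np.nan, 3.48])

rounded = ds_demo.round(1)                    # built-in, whole-series rounding
mapped = ds_demo.map(lambda x: round(x, 1))   # element-by-element, as above

print(rounded)
print(mapped)
```

`map` remains the more general tool when no built-in method exists for the transformation you need.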
