# Pandas Series

```python
class pandas.Series(data=None, index=None, dtype=None, name=None, copy=None, fastpath=False)
```

One-dimensional ndarray with axis labels (including time series).

Labels need not be unique but must be a hashable type. The object supports both integer- and label-based indexing and
provides a host of methods for performing operations involving the index. Statistical methods from ndarray have been
overridden to automatically exclude missing data (currently represented as NaN).
Operations between Series `(+, -, /, *, **)` align values based on their associated index values; the Series need not be the same length.

The result index will be the sorted union of the two indexes.
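A minimal sketch of this alignment, using two small hypothetical Series (`a` and `b` are illustrative names, not part of this notebook). Labels that exist in only one operand produce `NaN` in the result:

```python
import pandas as pd

# Two Series of different lengths that share only the label "z"
a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20], index=["z", "w"])

# Values are matched by label, not by position
total = a + b
print(total)
```

Only `z` exists in both operands, so it is the only label with a numeric result; the result index is the sorted union `w, x, y, z`.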

In [2]:
import pandas as pd
import numpy as np

### 1. Creating a Series

**Creating series with list**

In [3]:
# creating a Series by passing a list to the data parameter
_list = list(range(10,100, 10))
ps = pd.Series(data=_list)
ps

0    10
1    20
2    30
3    40
4    50
5    60
6    70
7    80
8    90
dtype: int64

**Creating Series with dictionary**

In [4]:
_dict = {"John": 25, "Jane": 30, "Bob": 35, "Alice": 40, "Charlie": 45, 
         "David": 50, "Eve": 55, "Frank": 60, "Grace": 65, "Henry": 70}

ps = pd.Series(data=_dict)
ps

John       25
Jane       30
Bob        35
Alice      40
Charlie    45
David      50
Eve        55
Frank      60
Grace      65
Henry      70
dtype: int64

Note: here the keys of the dict passed to the `data` parameter become the index labels, so we can access each value of the Series by its label.

Note: if we also pass the `index` parameter, the Series is built from that index instead: only the dict keys listed in `index` are kept, and labels missing from the dict get `NaN`.
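As an illustrative sketch of passing both a dict and `index` (the variable names and the label `Zed` are hypothetical, chosen only for this example):

```python
import pandas as pd

ages = {"John": 25, "Jane": 30, "Bob": 35}

# "Zed" is a made-up label that is not a key of the dict
subset = pd.Series(data=ages, index=["Jane", "Zed"])
print(subset)
```

`Jane` keeps her value, `John` and `Bob` are left out because they are not in `index`, and `Zed` gets `NaN` (which also turns the dtype into `float64`).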

**Creating series with tuple**

`Note:` a Series can be created only from an array-like object, a dictionary, an iterable, or a scalar value, so we can also create one by passing a tuple.

In [5]:
_tuple = (("John", 25), ("Jane", 30), ("Bob", 35), ("Alice", 40), ("Charlie", 45), ("David", 50), 
         ("Eve", 55), ("Frank", 60), ("Grace", 65), ("Henry", 70))
ps = pd.Series(data=_tuple)

In [6]:
ps

0       (John, 25)
1       (Jane, 30)
2        (Bob, 35)
3      (Alice, 40)
4    (Charlie, 45)
5      (David, 50)
6        (Eve, 55)
7      (Frank, 60)
8      (Grace, 65)
9      (Henry, 70)
dtype: object

`Note:` In this result each value is a `(name, age)` pair, which can sometimes be useful, but it would be better if the names were used as the index labels. For that we need to convert the tuple into a dict object.

In [7]:
_tuple = (("John", 25), ("Jane", 30), ("Bob", 35), ("Alice", 40), ("Charlie", 45), ("David", 50), 
         ("Eve", 55), ("Frank", 60), ("Grace", 65), ("Henry", 70))
ps = pd.Series(data=dict(_tuple))
ps

John       25
Jane       30
Bob        35
Alice      40
Charlie    45
David      50
Eve        55
Frank      60
Grace      65
Henry      70
dtype: int64

As we can see, this result is more meaningful than the previous one.

# Most commonly used Series attributes

* array: The ExtensionArray of the data backing this Series or Index.

**Creating a series for testing our dataset**

In [8]:
_rs = np.random.randint(50, size=50)
ps = pd.Series(_rs)
for i in range(10):
    ps[np.random.randint(0,49,)] = np.nan 
ps

0      NaN
1     15.0
2      3.0
3      2.0
4      9.0
5     22.0
6     49.0
7     24.0
8      NaN
9      NaN
10     5.0
11     NaN
12     NaN
13    18.0
14    12.0
15    12.0
16    13.0
17     7.0
18    23.0
19    10.0
20    46.0
21    43.0
22    49.0
23    11.0
24     NaN
25     1.0
26    24.0
27    38.0
28    33.0
29    28.0
30    35.0
31    36.0
32     NaN
33    36.0
34     NaN
35    25.0
36     NaN
37    27.0
38    49.0
39    40.0
40     7.0
41    39.0
42    14.0
43    29.0
44    15.0
45    49.0
46    18.0
47    20.0
48     NaN
49    31.0
dtype: float64

Now we have a randomly generated series that contains some `NaN` (not a number) values. Next we will work through some of the Pandas Series attributes.

`Note:` here **NaN**(not a number) is the standard missing data marker used in pandas. 

#### 1. array - attribute
The ExtensionArray of the data backing this Series or Index.

Pandas Series objects have an `.array` attribute that gives access to the array backing the Series. For NumPy-backed data such as this `float64` Series, the result is a thin wrapper around the underlying NumPy array (displayed as `PandasArray` in the output below; pandas 2.1 and later call it `NumpyExtensionArray`). For extension dtypes such as categorical data, `.array` returns the corresponding extension array itself.

In [9]:
# getting the array from the series.
ps.array

<PandasArray>
[ nan, 15.0,  3.0,  2.0,  9.0, 22.0, 49.0, 24.0,  nan,  nan,  5.0,  nan,  nan,
 18.0, 12.0, 12.0, 13.0,  7.0, 23.0, 10.0, 46.0, 43.0, 49.0, 11.0,  nan,  1.0,
 24.0, 38.0, 33.0, 28.0, 35.0, 36.0,  nan, 36.0,  nan, 25.0,  nan, 27.0, 49.0,
 40.0,  7.0, 39.0, 14.0, 29.0, 15.0, 49.0, 18.0, 20.0,  nan, 31.0]
Length: 50, dtype: float64

Note: if we want to work on a plain NumPy array, we need to use the `Series.to_numpy()` method.

In [10]:
ps.to_numpy()

array([nan, 15.,  3.,  2.,  9., 22., 49., 24., nan, nan,  5., nan, nan,
       18., 12., 12., 13.,  7., 23., 10., 46., 43., 49., 11., nan,  1.,
       24., 38., 33., 28., 35., 36., nan, 36., nan, 25., nan, 27., 49.,
       40.,  7., 39., 14., 29., 15., 49., 18., 20., nan, 31.])

The result above is the NumPy array obtained from the Pandas series.
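A short sketch of `to_numpy()` on a small hypothetical Series (not the random one used above). It also shows the `na_value` parameter, which replaces missing values during the conversion:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

arr = s.to_numpy()                  # NaN is kept as-is
filled = s.to_numpy(na_value=0.0)   # NaN is replaced while converting

print(type(arr).__name__, arr)
print(filled)
```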

___

#### 2. axes - attribute

`Series.axes` is an attribute that returns a list of the axis labels of the Series. Because a Series is one-dimensional, the list contains a single element, the index (row axis labels); there is no column axis. To set the name of the index, you can use the `Series.rename_axis()` method. To replace the index with the default `RangeIndex`, you can use the `Series.reset_index()` method (pass `drop=True` to get a Series back rather than a DataFrame).

In [11]:
ps.axes

[RangeIndex(start=0, stop=50, step=1)]
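A minimal sketch of how `axes` relates to `index`, using a small hypothetical Series; the axis name `row_id` is an arbitrary choice for illustration:

```python
import pandas as pd

s = pd.Series([1.5, 2.5, 3.5])

# A Series is one-dimensional, so axes holds exactly one entry: the index
print(len(s.axes))
print(s.axes[0].equals(s.index))

# rename_axis() names the index and returns a new Series
named = s.rename_axis("row_id")
print(named.index.name)
```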

____

#### 3. dtype and dtypes - attributes
-> Both `dtype` and `dtypes` return the dtype object of the underlying data.


In [12]:
ps.dtype

dtype('float64')

In [13]:
ps.dtypes

dtype('float64')

____

#### 4. hasnans - attribute
-> Returns `True` if there are any NaN values present inside the series.

In [14]:
ps.array

<PandasArray>
[ nan, 15.0,  3.0,  2.0,  9.0, 22.0, 49.0, 24.0,  nan,  nan,  5.0,  nan,  nan,
 18.0, 12.0, 12.0, 13.0,  7.0, 23.0, 10.0, 46.0, 43.0, 49.0, 11.0,  nan,  1.0,
 24.0, 38.0, 33.0, 28.0, 35.0, 36.0,  nan, 36.0,  nan, 25.0,  nan, 27.0, 49.0,
 40.0,  7.0, 39.0, 14.0, 29.0, 15.0, 49.0, 18.0, 20.0,  nan, 31.0]
Length: 50, dtype: float64

As we can see, it contains NaN values. To check for NaNs without printing the whole series, we can use the `Series.hasnans` attribute.

In [15]:
ps.hasnans

True
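`hasnans` only answers yes or no. As a small sketch on a hypothetical Series, `isna().sum()` gives the number of missing values and `dropna()` removes them:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0, np.nan])

print(s.hasnans)            # are there any missing values?
print(s.isna().sum())       # how many values are missing?

cleaned = s.dropna()        # new Series without the missing values
print(cleaned.hasnans)
```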

___

#### 5. index - attribute
-> The index (axis labels) of the Series.

In [16]:
ps.index

RangeIndex(start=0, stop=50, step=1)

As we can see, both `index` and `axes` hold the same RangeIndex.

____

#### 6. name - attribute
-> It returns the name of the Series.

The name of a Series becomes its index or column name if it is used to form a DataFrame. It is also used whenever displaying the Series using the interpreter.
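A minimal sketch of the name turning into a column name, using two hypothetical Series (`ages` and `heights` are illustrative):

```python
import pandas as pd

ages = pd.Series([25, 30], name="age")
heights = pd.Series([170, 165], name="height")

# Each Series name becomes a column name of the DataFrame
frame = pd.concat([ages, heights], axis=1)
print(frame.columns.tolist())

# to_frame() also uses the Series name for its single column
single = ages.to_frame()
print(single.columns.tolist())
```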

In [17]:
# Now we will print the first 5 values
ps.head()

0     NaN
1    15.0
2     3.0
3     2.0
4     9.0
dtype: float64

As we can see, the series does not have a name yet, so the `Series.name` attribute returns None.

In [18]:
ps.name

Now we will set the name of the series above.

In [19]:
ps.name = "Random Series"

In [20]:
ps.name

'Random Series'

In [21]:
ps.head()

0     NaN
1    15.0
2     3.0
3     2.0
4     9.0
Name: Random Series, dtype: float64

Now we can see the name of the series in both results: from the `name` attribute and from the `head()` method.

___

#### nbytes - attribute
-> Return the number of bytes in the underlying data.

In [22]:
ps.nbytes

400

As we can see, this series consumes 400 bytes. It holds 50 items, each of which takes 8 bytes because they are `float64` values (50 × 8 = 400).
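The same arithmetic can be checked in code. This sketch uses hypothetical Series of zeros and compares `float64` with the smaller `int8` dtype:

```python
import numpy as np
import pandas as pd

wide = pd.Series(np.zeros(50))                   # 50 float64 values
narrow = pd.Series(np.zeros(50, dtype=np.int8))  # 50 int8 values

# nbytes equals the number of elements times the bytes per element
print(wide.size * wide.dtype.itemsize, wide.nbytes)
print(narrow.size * narrow.dtype.itemsize, narrow.nbytes)
```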

In [23]:
np.float64?

Init signature: np.float64(x=0, /)
Docstring:     
Double-precision floating-point number type, compatible with Python `float`
and C ``double``.

:Character code: ``'d'``
:Canonical name: `numpy.double`
:Alias: `numpy.float_`
:Alias on this platform (win32 AMD64): `numpy.float64`: 64-bit precision floating-point number type: sign bit, 11 bits exponent, 52 bits mantissa.
File:           c:\users\manis\appdata\local\programs\python\python310\lib\site-packages\numpy\__init__.py
Type:           type
Subclasses:     


#### 7. size - attribute
-> Return the number of elements in the underlying data.

In [24]:
ps.size

50

As the `size` attribute shows, our series contains 50 items.

#### 8. values - attribute
 -> Return Series as ndarray or ndarray-like depending on the dtype.

In [25]:
ps.values

array([nan, 15.,  3.,  2.,  9., 22., 49., 24., nan, nan,  5., nan, nan,
       18., 12., 12., 13.,  7., 23., 10., 46., 43., 49., 11., nan,  1.,
       24., 38., 33., 28., 35., 36., nan, 36., nan, 25., nan, 27., 49.,
       40.,  7., 39., 14., 29., 15., 49., 18., 20., nan, 31.])

As per the pandas documentation, we should avoid the `Series.values` attribute and use the `Series.array` attribute or the `Series.to_numpy()` method instead.

___

#### 9. is_unique - attribute
Return a boolean indicating whether the values in the object are unique.

In [26]:
ps.is_unique

False

___

# Most commonly used Pandas Series methods

#### 1. Basic Level methods

* `add():` Return addition of series and other, element-wise (binary operator add).
* `sub():` Return subtraction of series and other, element-wise (binary operator sub).
* `mul():` Return multiplication of series and other, element-wise (binary operator mul).
* `div():` Return floating division of series and other, element-wise (binary operator truediv).
* `info():` Print a concise summary of a Series.
* `describe():` Generate descriptive statistics.
* `min():` Return the minimum of the values over the requested axis.
* `max():`  Return the maximum of the values over the requested axis.
* `mean():`  Return the mean of the values over the requested axis.
* `median():` Return the median of the values over the requested axis.
* `mode():` Return the mode(s) of the Series.
* `count():` Return number of non-NA/null observations in the series.

##### `Series.add()` method
```python
Series.add(other, level=None, fill_value=None, axis=0)
```
Return Addition of series and other, element-wise (binary operator add).

Equivalent to `series + other`, but with support to substitute a fill_value for missing data in either one of the inputs.
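A minimal sketch of the difference `fill_value` makes, using two hypothetical Series whose indexes only partly overlap:

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=["a", "b", "c"])
b = pd.Series([10, 20, 30], index=["b", "c", "d"])

# Plain operator: labels missing on one side give NaN
plain = a + b
print(plain)

# fill_value=0 substitutes 0 for the missing side before adding
filled = a.add(b, fill_value=0)
print(filled)
```

With the plain operator, `a` and `d` come out as `NaN`; with `fill_value=0` they keep the value from the side where they exist.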

In [27]:
# first we will create two random series then we will add them
s1 = pd.Series(data = np.random.randint(0,100, 10))
s2 = pd.Series(data = np.random.randint(0,100, 10))

Adding the s1 and s2 series

In [28]:
# first printing both series as array
print("Series s1: ",s1.to_numpy())
print("Series s2: ",s2.to_numpy())

# Now adding both series
s3 = s1.add(s2)
print("Series s1 + s2: ", s3.to_numpy())

Series s1:  [99 10 74 60 40 17 39 75 78 16]
Series s2:  [12 65 78 32 45 84 63 29 94 42]
Series s1 + s2:  [111  75 152  92  85 101 102 104 172  58]


Displaying all three series side by side by using the `pandas.concat` function.

In [29]:
pd.concat([s1,s2,s3], axis=1, keys=["Series s1", "Series s2", "Series s1 + s2"])

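|   | Series s1 | Series s2 | Series s1 + s2 |
|---|---|---|---|
| 0 | 99 | 12 | 111 |
| 1 | 10 | 65 | 75 |
| 2 | 74 | 78 | 152 |
| 3 | 60 | 32 | 92 |
| 4 | 40 | 45 | 85 |
| 5 | 17 | 84 | 101 |
| 6 | 39 | 63 | 102 |
| 7 | 75 | 29 | 104 |
| 8 | 78 | 94 | 172 |
| 9 | 16 | 42 | 58 |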


##### `Series.sub()` Method

Return Subtraction of series and other, element-wise (binary operator sub).

Equivalent to `series - other`, but with support to substitute a fill_value for missing data in either one of the inputs.

In [30]:
# first printing both series as array
print("Series s1: ",s1.to_numpy())
print("Series s2: ",s2.to_numpy())

# Now subtracting s2 from s1
s3 = s1.sub(s2)
print("Series s1 - s2: ", s3.to_numpy())

Series s1:  [99 10 74 60 40 17 39 75 78 16]
Series s2:  [12 65 78 32 45 84 63 29 94 42]
Series s1 - s2:  [ 87 -55  -4  28  -5 -67 -24  46 -16 -26]


In [31]:
pd.concat([s1,s2,s3], axis=1, keys=["Series s1", "Series s2", "Series s1 - s2"])

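|   | Series s1 | Series s2 | Series s1 - s2 |
|---|---|---|---|
| 0 | 99 | 12 | 87 |
| 1 | 10 | 65 | -55 |
| 2 | 74 | 78 | -4 |
| 3 | 60 | 32 | 28 |
| 4 | 40 | 45 | -5 |
| 5 | 17 | 84 | -67 |
| 6 | 39 | 63 | -24 |
| 7 | 75 | 29 | 46 |
| 8 | 78 | 94 | -16 |
| 9 | 16 | 42 | -26 |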


##### `Series.mul()` method

Return Multiplication of series and other, element-wise (binary operator mul).

Equivalent to `series * other`, but with support to substitute a fill_value for missing data in either one of the inputs.

In [32]:
# first printing both series as array
print("Series s1: ",s1.to_numpy())
print("Series s2: ",s2.to_numpy())

# Now multiplying both series
s3 = s1.mul(s2)
print("Series s1 * s2: ", s3.to_numpy())

Series s1:  [99 10 74 60 40 17 39 75 78 16]
Series s2:  [12 65 78 32 45 84 63 29 94 42]
Series s1 * s2:  [1188  650 5772 1920 1800 1428 2457 2175 7332  672]


In [33]:
pd.concat([s1,s2,s3], axis=1, keys=["Series s1", "Series s2", "Series s1 * s2"])

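|   | Series s1 | Series s2 | Series s1 * s2 |
|---|---|---|---|
| 0 | 99 | 12 | 1188 |
| 1 | 10 | 65 | 650 |
| 2 | 74 | 78 | 5772 |
| 3 | 60 | 32 | 1920 |
| 4 | 40 | 45 | 1800 |
| 5 | 17 | 84 | 1428 |
| 6 | 39 | 63 | 2457 |
| 7 | 75 | 29 | 2175 |
| 8 | 78 | 94 | 7332 |
| 9 | 16 | 42 | 672 |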


##### `Series.div()` method

Return Floating division of series and other, element-wise (binary operator truediv).

Equivalent to `series / other`, but with support to substitute a fill_value for missing data in either one of the inputs.

In [34]:
# first printing both series as array
print("Series s1: ",s1.to_numpy())
print("Series s2: ",s2.to_numpy())

# Now dividing s1 by s2
s3 = s1.div(s2)
print("Series s1 / s2: ", s3.to_numpy())

Series s1:  [99 10 74 60 40 17 39 75 78 16]
Series s2:  [12 65 78 32 45 84 63 29 94 42]
Series s1 / s2:  [8.25       0.15384615 0.94871795 1.875      0.88888889 0.20238095
 0.61904762 2.5862069  0.82978723 0.38095238]


In [35]:
pd.concat([s1,s2,s3], axis=1, keys=["Series s1", "Series s2", "Series s1 / s2"])

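|   | Series s1 | Series s2 | Series s1 / s2 |
|---|---|---|---|
| 0 | 99 | 12 | 8.25 |
| 1 | 10 | 65 | 0.153846 |
| 2 | 74 | 78 | 0.948718 |
| 3 | 60 | 32 | 1.875 |
| 4 | 40 | 45 | 0.888889 |
| 5 | 17 | 84 | 0.202381 |
| 6 | 39 | 63 | 0.619048 |
| 7 | 75 | 29 | 2.586207 |
| 8 | 78 | 94 | 0.829787 |
| 9 | 16 | 42 | 0.380952 |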


_____

##### `Series.describe()` method

Generate descriptive statistics.

Descriptive statistics include those that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values.

Analyzes both numeric and object series, as well as DataFrame column sets of mixed data types. The output will vary depending on what is provided. Refer to the notes below for more detail.

For numeric data, the result’s index will include count, mean, std, min, max as well as lower, 50 and upper percentiles. By default the lower percentile is 25 and the upper percentile is 75. The 50 percentile is the same as the median.

For object data (e.g. strings or timestamps), the result’s index will include count, unique, top, and freq. The top is the most common value. The freq is the most common value’s frequency. Timestamps also include the first and last items.

If multiple object values have the highest count, then the count and top results will be arbitrarily chosen from among those with the highest count.

For mixed data types provided via a DataFrame, the default is to return only an analysis of numeric columns. If the dataframe consists only of object and categorical data without any numeric columns, the default is to return an analysis of both the object and categorical columns. If include='all' is provided as an option, the result will include a union of attributes of each type.

The include and exclude parameters can be used to limit which columns in a DataFrame are analyzed for the output. The parameters are ignored when analyzing a Series.

In [36]:
s1.describe()

count    10.000000
mean     50.800000
std      30.814138
min      10.000000
25%      22.500000
50%      50.000000
75%      74.750000
max      99.000000
dtype: float64

* count : the number of non-NA/null values.
* mean : mean of all the values; NA/null values are not counted.
* std : standard deviation of the values.
* min : minimum value in the series.
* max : maximum value in the series.
* 25%, 50%, 75% : percentile values for the given series.
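The description above can be sketched on two small hypothetical Series: one numeric, with custom percentiles passed through the `percentiles` parameter, and one holding strings:

```python
import pandas as pd

# Numeric data: request the 10th, 50th and 90th percentiles explicitly
numbers = pd.Series([1, 2, 3, 4, 5])
custom = numbers.describe(percentiles=[0.1, 0.5, 0.9])
print(custom)

# String data: the summary switches to count, unique, top and freq
letters = pd.Series(["a", "b", "a"])
summary = letters.describe()
print(summary)
```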

_____

##### `Series.info()` method

Print a concise summary of a Series.

This method prints information about a Series including the index dtype, non-null values and memory usage.

In [54]:
ps.info()

<class 'pandas.core.series.Series'>
RangeIndex: 50 entries, 0 to 49
Series name: Random Series
Non-Null Count  Dtype  
--------------  -----  
40 non-null     float64
dtypes: float64(1)
memory usage: 528.0 bytes


_____

##### `Series.min()` method

Return the minimum of the values over the requested axis.

If you want the index of the minimum, use idxmin. This is the equivalent of the numpy.ndarray method argmin.
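A small sketch of the difference, on a hypothetical labelled Series: `min()` returns the value, `idxmin()` returns its label, and `argmin()` returns its integer position:

```python
import pandas as pd

s = pd.Series([30, 10, 20], index=["a", "b", "c"])

print(s.min())      # the smallest value
print(s.idxmin())   # the label of the smallest value
print(s.argmin())   # the integer position of the smallest value
```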

In [37]:
s1.to_numpy()

array([99, 10, 74, 60, 40, 17, 39, 75, 78, 16])

In [38]:
s1.min()

10

In [39]:
ps.to_numpy()

array([nan, 15.,  3.,  2.,  9., 22., 49., 24., nan, nan,  5., nan, nan,
       18., 12., 12., 13.,  7., 23., 10., 46., 43., 49., 11., nan,  1.,
       24., 38., 33., 28., 35., 36., nan, 36., nan, 25., nan, 27., 49.,
       40.,  7., 39., 14., 29., 15., 49., 18., 20., nan, 31.])

In [40]:
ps.min()

1.0

_____

##### `Series.max()` method

Return the maximum of the values over the requested axis.

If you want the index of the maximum, use idxmax. This is the equivalent of the numpy.ndarray method argmax.

In [41]:
ps.to_numpy()

array([nan, 15.,  3.,  2.,  9., 22., 49., 24., nan, nan,  5., nan, nan,
       18., 12., 12., 13.,  7., 23., 10., 46., 43., 49., 11., nan,  1.,
       24., 38., 33., 28., 35., 36., nan, 36., nan, 25., nan, 27., 49.,
       40.,  7., 39., 14., 29., 15., 49., 18., 20., nan, 31.])

In [42]:
ps.max()

49.0

In [43]:
print("max value in s1: ", s1.max())
print("max value in s2: ", s2.max())

max value in s1:  99
max value in s2:  94


_____

##### `Series.count()` method

-> Return number of non-NA/null observations in the Series.

In [44]:
ps.to_numpy()

array([nan, 15.,  3.,  2.,  9., 22., 49., 24., nan, nan,  5., nan, nan,
       18., 12., 12., 13.,  7., 23., 10., 46., 43., 49., 11., nan,  1.,
       24., 38., 33., 28., 35., 36., nan, 36., nan, 25., nan, 27., 49.,
       40.,  7., 39., 14., 29., 15., 49., 18., 20., nan, 31.])

In [45]:
print("length of the ps series: ", len(ps))
print("non-NA/null count in ps series: ", ps.count())

length of the ps series:  50
non-NA/null count in ps series:  40


_____

##### `Series.mean()` method

Return the mean of the values over the requested axis.

By default it will skip all the NA/null values in the computation.
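This default is controlled by the `skipna` parameter. A minimal sketch on a hypothetical Series with one missing value:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

default_mean = s.mean()              # NaN is skipped: (1 + 3) / 2
strict_mean = s.mean(skipna=False)   # NaN propagates into the result

print(default_mean)
print(strict_mean)
```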

In [46]:
ps.to_numpy()

array([nan, 15.,  3.,  2.,  9., 22., 49., 24., nan, nan,  5., nan, nan,
       18., 12., 12., 13.,  7., 23., 10., 46., 43., 49., 11., nan,  1.,
       24., 38., 33., 28., 35., 36., nan, 36., nan, 25., nan, 27., 49.,
       40.,  7., 39., 14., 29., 15., 49., 18., 20., nan, 31.])

In [47]:
ps.mean()

24.175

____

##### `Series.median()` method

Return the median of the values over the requested axis.

By default it will skip all NA/null values in the computation.

In [48]:
ps.to_numpy()

array([nan, 15.,  3.,  2.,  9., 22., 49., 24., nan, nan,  5., nan, nan,
       18., 12., 12., 13.,  7., 23., 10., 46., 43., 49., 11., nan,  1.,
       24., 38., 33., 28., 35., 36., nan, 36., nan, 25., nan, 27., 49.,
       40.,  7., 39., 14., 29., 15., 49., 18., 20., nan, 31.])

In [49]:
ps.median()

23.5

____

##### `Series.mode()` method

Return the mode(s) of the Series.

The mode is the value that appears most often. There can be multiple modes.

Always returns Series even if only one value is returned.
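A short sketch with hypothetical data showing both cases, several modes and a single mode:

```python
import pandas as pd

# 1 and 2 both appear twice, so both are modes
tied = pd.Series([1, 1, 2, 2, 3]).mode()
print(tied.tolist())

# Even with one mode, the result is still a Series
single = pd.Series([5, 5, 7]).mode()
print(type(single).__name__, single.tolist())
```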

In [50]:
ps.to_numpy()

array([nan, 15.,  3.,  2.,  9., 22., 49., 24., nan, nan,  5., nan, nan,
       18., 12., 12., 13.,  7., 23., 10., 46., 43., 49., 11., nan,  1.,
       24., 38., 33., 28., 35., 36., nan, 36., nan, 25., nan, 27., 49.,
       40.,  7., 39., 14., 29., 15., 49., 18., 20., nan, 31.])

In [51]:
ps.mode()

0    49.0
Name: Random Series, dtype: float64

____

# Series methods for Indexing / Iteration / Selection of items
These accessors and methods allow us to access items of a `Series` by index. We can access an item by its integer position or by its label name.

* `Series.at[]:` Access a single value for a row/column label pair.
* `Series.iat[]:` Access a single value for a row/column pair by integer position.
* `Series.loc[]:` Access a group of rows and columns by label(s) or a boolean array.
* `Series.iloc[]:` Purely integer-location based indexing for selection by position.

##### `Series.at[]` & `Series.iat[]` accessors

Access a single value for a row/column label pair.

Similar to `loc`, in that both provide label-based lookups. Use `at` if you only need to get or set a single value in a DataFrame or Series.

In [74]:
_dict = {"John": 25, "Jane": 30, "Bob": 35, "Alice": 40, "Charlie": 45, 
         "David": 50, "Eve": 55, "Frank": 60, "Grace": 65, "Henry": 70}

ps = pd.Series(data=_dict, name="Person", dtype=np.int8)

Now we have a Series named `Person`, which has the persons' names as index labels and whose values are of type `np.int8`.

Note: if we use `at[]` on a DataFrame, we need to provide both the row label and the column name to access a single value. Since a Series itself acts as a single column, we just need to pass the index label.

In [75]:
ps

John       25
Jane       30
Bob        35
Alice      40
Charlie    45
David      50
Eve        55
Frank      60
Grace      65
Henry      70
Name: Person, dtype: int8

In [76]:
# we can access any value by its label name
ps.at["Bob"]

35

If we want to access the value by its integer position instead, we can use `Series.iat[]` rather than `Series.at[]`.

In [77]:
ps.iat[2]

35

Note: `at[]` and `iat[]` accept only a single label or position. If we pass a list or tuple to access multiple values, an error is raised.
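A minimal sketch of both points, on a small hypothetical Series: `at[]` and `iat[]` can set a single value as well as read it, and a list of labels is rejected. The exact exception type is treated as an assumption here, so the code catches a few likely candidates rather than relying on one:

```python
import pandas as pd

ages = pd.Series({"John": 25, "Jane": 30, "Bob": 35})

# at[] / iat[] can also set a single value in place
ages.at["Bob"] = 36
ages.iat[0] = 26
print(ages)

# A list of labels is not a scalar key, so the lookup fails
try:
    ages.at[["John", "Jane"]]
    raised = False
except (ValueError, KeyError, TypeError) as exc:
    raised = True
    print("error:", type(exc).__name__)
```

To read several values at once, `loc[]` or `iloc[]` is the appropriate accessor.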

____

##### `Series.loc[]` & `Series.iloc[]` accessors

Access a group of rows and columns by label(s) or a boolean array.

`.loc[]` is primarily label based, but may also be used with a boolean array.

* A single label, e.g. 5 or 'a', (note that 5 is interpreted as a label of the index, and never as an integer position along the index).

* A list or array of labels, e.g. ['a', 'b', 'c'].

* A slice object with labels, e.g. 'a':'f'.

* A boolean array of the same length as the axis being sliced, e.g. [True, False, True].

* An alignable boolean Series. The index of the key will be aligned before masking.

* An alignable Index. The Index of the returned selection will be the input.

* A callable function with one argument (the calling Series or DataFrame) and that returns valid output for indexing (one of the above)


In [78]:
ps

John       25
Jane       30
Bob        35
Alice      40
Charlie    45
David      50
Eve        55
Frank      60
Grace      65
Henry      70
Name: Person, dtype: int8

Accessing a Single value by label

In [79]:
ps.loc["John"]

25

Accessing multiple values by passing a list of labels.

In [80]:
ps.loc[["John", "Bob"]]

John    25
Bob     35
Name: Person, dtype: int8

Accessing multiple values by slicing the Series like a list.

In [81]:
ps.loc["Charlie": "Frank"]

Charlie    45
David      50
Eve        55
Frank      60
Name: Person, dtype: int8

Note: when slicing a Pandas Series or DataFrame by label with `.loc[]`, both the start and the stop of the slice are included.
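A quick sketch of this difference between label slices and position slices, using a small hypothetical Series:

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

# Label slice: the stop label "c" is included
by_label = s.loc["b":"c"]
print(by_label.tolist())

# Position slice: the stop position 2 is excluded, as in Python lists
by_position = s.iloc[1:2]
print(by_position.tolist())
```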

Filtering the selection with a boolean array. We can create a boolean array by broadcasting a comparison over the series.

For example, getting all the persons below the age of 40.

In [84]:
# first creating the boolean array.
_filter = ps < 40
_filter.array

<PandasArray>
[True, True, True, False, False, False, False, False, False, False]
Length: 10, dtype: bool

In [85]:
# Now accessing the element by using the _filter boolean array.
ps.loc[_filter]

John    25
Jane    30
Bob     35
Name: Person, dtype: int8

Note: `.loc[]` can be used with both `pandas.Series` and `pandas.DataFrame`. We will see more `.loc[]` operations while working with `pandas.DataFrame`.
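Two further `.loc[]` inputs from the list above, sketched on a shortened hypothetical copy of the ages data: a boolean mask built from combined conditions, and a callable:

```python
import pandas as pd

ages = pd.Series({"John": 25, "Jane": 30, "Bob": 35, "Alice": 40, "Charlie": 45})

# Combine conditions with & (and) or | (or); each condition needs parentheses
between = ages.loc[(ages >= 30) & (ages < 45)]
print(between)

# A callable receives the Series and returns a valid indexer
older = ages.loc[lambda s: s > 35]
print(older)
```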

_____
**`Series.iloc[]`**

Purely integer-location based indexing for selection by position.

.iloc[] is primarily integer position based (from 0 to length-1 of the axis), but may also be used with a boolean array.

* An integer, e.g. 5.

* A list or array of integers, e.g. [4, 3, 0].

* A slice object with ints, e.g. 1:7.

* A boolean array.

* A callable function with one argument (the calling Series or DataFrame) and that returns valid output for indexing (one of the above). This is useful in method chains, when you don’t have a reference to the calling object, but would like to base your selection on some value.

* A tuple of row and column indexes. The tuple elements consist of one of the above inputs, e.g. (0, 1).

`.iloc` will raise IndexError if a requested indexer is out-of-bounds, except slice indexers which allow out-of-bounds indexing (this conforms with python/numpy slice semantics).
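A minimal sketch of that out-of-bounds behaviour, on a hypothetical three-item Series:

```python
import pandas as pd

s = pd.Series([10, 20, 30])

# A slice that runs past the end is simply truncated
tail = s.iloc[1:10]
print(tail.tolist())

# A single out-of-bounds position raises IndexError
try:
    s.iloc[10]
    raised = False
except IndexError:
    raised = True
print(raised)
```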

Accessing a single value by passing its index number.

In [87]:
ps.iloc[4]

45

Slicing the series in given range of index.

In [88]:
ps.iloc[3:7]

Alice      40
Charlie    45
David      50
Eve        55
Name: Person, dtype: int8

_____

##### `Series.items()` method

Lazily iterate over (index, value) tuples.

This method returns an iterable tuple (index, value). This is convenient if you want to create a lazy iterator.

In [100]:
for key, value in ps.items():
    print(key,"-->",value)

John --> 25
Jane --> 30
Bob --> 35
Alice --> 40
Charlie --> 45
David --> 50
Eve --> 55
Frank --> 60
Grace --> 65
Henry --> 70


____

##### `Series.keys()` method

Return alias for index.

In [101]:
ps.keys()

Index(['John', 'Jane', 'Bob', 'Alice', 'Charlie', 'David', 'Eve', 'Frank',
       'Grace', 'Henry'],
      dtype='object')

##### `Series.pop()` method
Return item and drop it from the series. Raise KeyError if not found.

In [105]:
ser = pd.Series([1,2,3])
ser

0    1
1    2
2    3
dtype: int64

In [106]:
ser.pop(0)

1

In [107]:
ser

1    2
2    3
dtype: int64
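The `KeyError` case can be sketched the same way; the label `99` is an arbitrary value that does not exist in the index:

```python
import pandas as pd

ser = pd.Series([1, 2, 3])

first = ser.pop(0)      # removes label 0 and returns its value
print(first, len(ser))

# Popping a label that is not in the index raises KeyError
try:
    ser.pop(99)
    missing = False
except KeyError:
    missing = True
print(missing)
```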

_____
_____

# Reindexing / Selection / Label Manipulation

Note: in this section we will work on a real dataset. We have a carSale dataset which contains many fields, but we need only a series from it, so we will get the 

In [112]:
carSale = pd.read_csv("./Carsales_dataset.csv")

In [113]:
carSale.head()

|   | Manufacturer | Model | Sales_in_thousands | __year_resale_value | Vehicle_type | Price_in_thousands | Engine_size | Horsepower | Wheelbase | Width | Length | Curb_weight | Fuel_capacity | Fuel_efficiency | Latest_Launch | Power_perf_factor |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Acura | Integra | 16.919 | 16.36 | Passenger | 21.5 | 1.8 | 140.0 | 101.2 | 67.3 | 172.4 | 2.639 | 13.2 | 28.0 | 02-02-2012 | 58.28015 |
| 1 | Acura | TL | 39.384 | 19.875 | Passenger | 28.4 | 3.2 | 225.0 | 108.1 | 70.3 | 192.9 | 3.517 | 17.2 | 25.0 | 06-03-2011 | 91.370778 |
| 2 | Acura | CL | 14.114 | 18.225 | Passenger |  | 3.2 | 225.0 | 106.9 | 70.6 | 192.0 | 3.47 | 17.2 | 26.0 | 01-04-2012 |  |
| 3 | Acura | RL | 8.588 | 29.725 | Passenger | 42.0 | 3.5 | 210.0 | 114.6 | 71.4 | 196.6 | 3.85 | 18.0 | 22.0 | 03-10-2011 | 91.389779 |
| 4 | Audi | A4 | 20.397 | 22.255 | Passenger | 23.99 | 1.8 | 150.0 | 102.6 | 68.2 | 178.0 | 2.998 | 16.4 | 27.0 | 10-08-2011 | 62.777639 |
