# Pandas Series

<img src="DataSets/companies.png" width="450px">

Convert the previous image into a Pandas Series

In [58]:
import pandas as pd

In [59]:
companies = [
    'Apple', 'Samsung', 'Alphabet', 'Foxconn',
    'Microsoft', 'Huawei', 'Dell Technologies',
    'Meta', 'Sony', 'Hitachi', 'Intel',
    'IBM', 'Tencent', 'Panasonic'
    ]
data = [
    274515, 200734, 182527, 181945, 143015,
    129184, 92224, 85965, 84893, 82345,
    77867, 73620, 69864, 63191
    ]
s = pd.Series(data, index=companies, name="Top Technology Companies by Revenue")
s

Apple                274515
Samsung              200734
Alphabet             182527
Foxconn              181945
Microsoft            143015
Huawei               129184
Dell Technologies     92224
Meta                  85965
Sony                  84893
Hitachi               82345
Intel                 77867
IBM                   73620
Tencent               69864
Panasonic             63191
Name: Top Technology Companies by Revenue, dtype: int64

## Selection and Location

### Select by index

We can select an element by its index label using square brackets:

In [60]:
s["Apple"]

274515

We can also use `.loc`, which is the preferred way because it always selects by label:

In [61]:
s.loc["Apple"]

274515

### Select by position

Series are ordered data structures, so we can also select an element by its position using the `.iloc` attribute. Like Python lists, `.iloc` accepts negative numbers to reference elements from the end of the series, so `.iloc[-1]` returns the last element.

In [62]:
s.iloc[0]

274515

In [63]:
s.iloc[-1]

63191

We can also use Python's slice notation with `.iloc`:

In [64]:
s.iloc[1:-4]

Samsung              200734
Alphabet             182527
Foxconn              181945
Microsoft            143015
Huawei               129184
Dell Technologies     92224
Meta                  85965
Sony                  84893
Hitachi               82345
Name: Top Technology Companies by Revenue, dtype: int64
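Slicing behaves slightly differently with `.iloc` and `.loc`. The following is a minimal sketch on a small, hypothetical series (`demo` is not part of the dataset above): positional slices exclude the stop position, as in Python lists, while label slices include the stop label.

```python
import pandas as pd

# Hypothetical three-element series, used only for this illustration.
demo = pd.Series([10, 20, 30], index=["a", "b", "c"])

# Positional slice: positions 0 and 1; the stop position (2) is excluded.
by_position = demo.iloc[0:2]

# Label slice: labels "a" through "c"; the stop label is included.
by_label = demo.loc["a":"c"]

print(by_position.tolist())  # [10, 20]
print(by_label.tolist())     # [10, 20, 30]
```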

### Check if an element is in the Series

In [65]:
"Apple" in s

True

In [66]:
"Huawei" in s

True

But the `in` operator only checks the index, not the values:

In [67]:
274515 in s

False
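To test whether a value (rather than a label) is present, one option is to check against `.values` or to compare element-wise. This is a small sketch on a hypothetical two-element series named `demo`, not the only way to do it:

```python
import pandas as pd

# Hypothetical series reusing two of the revenues from above.
demo = pd.Series([274515, 200734], index=["Apple", "Samsung"])

label_found = "Apple" in demo        # `in` looks at the index
value_as_label = 274515 in demo      # a value is not an index label

# Two ways to look for a value instead of a label.
value_in_array = 274515 in demo.values
value_by_comparison = bool((demo == 274515).any())

print(label_found, value_as_label, value_in_array, value_by_comparison)
# True False True True
```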

### Multiple Selection

In [68]:
s.loc[['Apple', 'Intel', 'Sony']]

Apple    274515
Intel     77867
Sony      84893
Name: Top Technology Companies by Revenue, dtype: int64

In [69]:
s.iloc[[0, 5, -1]]

Apple        274515
Huawei       129184
Panasonic     63191
Name: Top Technology Companies by Revenue, dtype: int64

## Activities

Select the revenue of `Intel` and store it in a variable named `intel_revenue`:

In [70]:
intel_revenue = s.loc["Intel"]
intel_revenue

77867

Select the revenue of the "second to last" element in our series `s` and store it in a variable named `second_to_last`:

In [71]:
second_to_last = s.iloc[-2]
second_to_last

69864

Use multiple label selection to retrieve the revenues of the companies:

* Samsung
* Dell Technologies
* Panasonic
* Microsoft

In [72]:
sub_series = s.loc[["Samsung", "Dell Technologies", "Panasonic", "Microsoft"]]

## Series Attributes and Methods

In [73]:
s.head()

Apple        274515
Samsung      200734
Alphabet     182527
Foxconn      181945
Microsoft    143015
Name: Top Technology Companies by Revenue, dtype: int64

In [74]:
s.tail()

Hitachi      82345
Intel        77867
IBM          73620
Tencent      69864
Panasonic    63191
Name: Top Technology Companies by Revenue, dtype: int64

### Main attributes

The underlying data

In [76]:
s.values

array([274515, 200734, 182527, 181945, 143015, 129184,  92224,  85965,
        84893,  82345,  77867,  73620,  69864,  63191], dtype=int64)

The index

In [77]:
s.index

Index(['Apple', 'Samsung', 'Alphabet', 'Foxconn', 'Microsoft', 'Huawei',
       'Dell Technologies', 'Meta', 'Sony', 'Hitachi', 'Intel', 'IBM',
       'Tencent', 'Panasonic'],
      dtype='object')

The name (if any)

In [78]:
s.name

'Top Technology Companies by Revenue'

The type associated with the values:

In [79]:
s.dtype

dtype('int64')

The size of the Series:

In [80]:
print(s.size)
print(s.shape)
len(s)

14
(14,)


14

### Statistical Methods

In [81]:
s.describe()

count        14.000000
mean     124420.642857
std       63686.481231
min       63191.000000
25%       78986.500000
50%       89094.500000
75%      172212.500000
max      274515.000000
Name: Top Technology Companies by Revenue, dtype: float64

In [82]:
s.mean()

124420.64285714286

In [83]:
s.median()

89094.5

In [84]:
s.std()

63686.48123135607

In [85]:
s.min(), s.max()

(63191, 274515)

In [86]:
s.quantile(.75)

172212.5

In [87]:
s.quantile(.99)

264923.47
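The 99th percentile above is not one of the original values because `quantile` interpolates linearly between neighbouring values by default. The sketch below reproduces that result by hand; the names `revenue`, `ordered`, `position`, `lower`, `fraction` and `manual` are introduced only for this illustration.

```python
import pandas as pd

# Same revenues as in `s`, rebuilt here so the example is self-contained.
revenue = pd.Series([
    274515, 200734, 182527, 181945, 143015,
    129184, 92224, 85965, 84893, 82345,
    77867, 73620, 69864, 63191,
])

ordered = revenue.sort_values().to_numpy()

# Fractional position of the 0.99 quantile among the sorted values.
position = 0.99 * (len(ordered) - 1)
lower = int(position)            # index of the value just below
fraction = position - lower      # how far to move towards the next value

# Linear interpolation between the two neighbouring values.
manual = ordered[lower] + fraction * (ordered[lower + 1] - ordered[lower])

print(round(manual, 2))                   # 264923.47
print(round(revenue.quantile(0.99), 2))   # 264923.47
```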

## Activities

In [88]:
american_companies = s[[
    'Meta', 'IBM', 'Microsoft',
    'Dell Technologies', 'Apple', 'Intel', 'Alphabet'
]]
american_companies

Meta                  85965
IBM                   73620
Microsoft            143015
Dell Technologies     92224
Apple                274515
Intel                 77867
Alphabet             182527
Name: Top Technology Companies by Revenue, dtype: int64

What's the average revenue of American Companies?

In [89]:
american_companies.mean()

132819.0

What's the median revenue of American Companies?

In [90]:
american_companies.median()

92224.0

## Sorting Series

Sorting by values (Ascending)

In [91]:
s.sort_values()

Panasonic             63191
Tencent               69864
IBM                   73620
Intel                 77867
Hitachi               82345
Sony                  84893
Meta                  85965
Dell Technologies     92224
Huawei               129184
Microsoft            143015
Foxconn              181945
Alphabet             182527
Samsung              200734
Apple                274515
Name: Top Technology Companies by Revenue, dtype: int64

Sorting by index (lexicographically and ascending)

In [92]:
s.sort_index()

Alphabet             182527
Apple                274515
Dell Technologies     92224
Foxconn              181945
Hitachi               82345
Huawei               129184
IBM                   73620
Intel                 77867
Meta                  85965
Microsoft            143015
Panasonic             63191
Samsung              200734
Sony                  84893
Tencent               69864
Name: Top Technology Companies by Revenue, dtype: int64

Sort in descending order

In [93]:
s.sort_values(ascending=False)

Apple                274515
Samsung              200734
Alphabet             182527
Foxconn              181945
Microsoft            143015
Huawei               129184
Dell Technologies     92224
Meta                  85965
Sony                  84893
Hitachi               82345
Intel                 77867
IBM                   73620
Tencent               69864
Panasonic             63191
Name: Top Technology Companies by Revenue, dtype: int64

In [94]:
s.sort_index(ascending=False)

Tencent               69864
Sony                  84893
Samsung              200734
Panasonic             63191
Microsoft            143015
Meta                  85965
Intel                 77867
IBM                   73620
Huawei               129184
Hitachi               82345
Foxconn              181945
Dell Technologies     92224
Apple                274515
Alphabet             182527
Name: Top Technology Companies by Revenue, dtype: int64

## Immutability

The sorting methods above do not modify `s`; each call returns a new, sorted series. If you DO want to mutate your series, that is, sort it and alter the underlying series (`s` in this case), you must pass the `inplace=True` parameter. When doing so, the method doesn't return anything, but the underlying series `s` is changed to contain the data in the required order.

In [95]:
s.head()

Apple        274515
Samsung      200734
Alphabet     182527
Foxconn      181945
Microsoft    143015
Name: Top Technology Companies by Revenue, dtype: int64

In [96]:
s.sort_values(inplace=True)

In [97]:
s.head()

Panasonic    63191
Tencent      69864
IBM          73620
Intel        77867
Hitachi      82345
Name: Top Technology Companies by Revenue, dtype: int64
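The difference between the default behaviour and `inplace=True` can be checked on a small, hypothetical series (`demo` and the other names below are assumptions made for this sketch):

```python
import pandas as pd

# Hypothetical unsorted series, used only for this illustration.
demo = pd.Series([30, 10, 20], index=["c", "a", "b"])

# Default behaviour: a new sorted series is returned, `demo` is untouched.
sorted_copy = demo.sort_values()
order_before = demo.tolist()

# With inplace=True the call returns None and `demo` itself is reordered.
result = demo.sort_values(inplace=True)
order_after = demo.tolist()

print(sorted_copy.tolist())  # [10, 20, 30]
print(order_before)          # [30, 10, 20]
print(result)                # None
print(order_after)           # [10, 20, 30]
```

Reassigning the result, as in `demo = demo.sort_values()`, has the same visible effect and is a common alternative to `inplace=True`.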

## Modifying Series

In [98]:
s['IBM'] = 0

In [99]:
s.sort_values().head()

IBM              0
Panasonic    63191
Tencent      69864
Intel        77867
Hitachi      82345
Name: Top Technology Companies by Revenue, dtype: int64

Adding elements

In [100]:
s["Tesla"] = 21450

In [101]:
s.sort_values().head()

IBM              0
Tesla        21450
Panasonic    63191
Tencent      69864
Intel        77867
Name: Top Technology Companies by Revenue, dtype: int64

Removing Elements

In [102]:
del s['Tesla']

In [103]:
s.sort_values().head()

IBM              0
Panasonic    63191
Tencent      69864
Intel        77867
Hitachi      82345
Name: Top Technology Companies by Revenue, dtype: int64
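Assignment and `del` both change the series in place. When the original should be kept, `drop` is one alternative that returns a new series instead. A minimal sketch with a hypothetical series `demo`:

```python
import pandas as pd

# Hypothetical series, used only for this illustration.
demo = pd.Series([1, 2, 3], index=["x", "y", "z"])

# drop returns a new series without the given label; `demo` keeps it.
without_y = demo.drop("y")

print(list(without_y.index))  # ['x', 'z']
print(list(demo.index))       # ['x', 'y', 'z']
```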

## Concatenating Series

We can concatenate series using the `pd.concat()` function: `pd.concat([series1, series2, ..., seriesN])`. It returns a new series with the values of all the given series concatenated, and leaves the originals unchanged. The same function also works with dataframes.

In [104]:
s2 = pd.Series([21_450, 4_120], index=['Tesla', 'Snapchat'])

In [105]:
new_s = pd.concat([s,s2])
new_s

Panasonic             63191
Tencent               69864
IBM                       0
Intel                 77867
Hitachi               82345
Sony                  84893
Meta                  85965
Dell Technologies     92224
Huawei               129184
Microsoft            143015
Foxconn              181945
Alphabet             182527
Samsung              200734
Apple                274515
Tesla                 21450
Snapchat               4120
dtype: int64
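Note that `pd.concat` does not merge repeated labels. The sketch below uses two small, hypothetical series (`first` and `second`) that share the label `"b"`; both entries survive in the result, so it is worth checking `index.is_unique` when duplicates are not expected.

```python
import pandas as pd

# Hypothetical series sharing the label "b".
first = pd.Series([1, 2], index=["a", "b"])
second = pd.Series([3, 4], index=["b", "c"])

combined = pd.concat([first, second])

print(list(combined.index))        # ['a', 'b', 'b', 'c']
print(combined.index.is_unique)    # False
print(combined.loc["b"].tolist())  # [2, 3]
print(len(first), len(second))     # 2 2  (the inputs are unchanged)
```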