### Querying Series

In [30]:
import pandas as pd
import numpy as np
students_classes = {
    'Alice': 'Physics',
    'Codeine': 'Chemistry',
    'Molly': 'English',
    'Kate': 'History'
}
s = pd.Series(students_classes)
s


Alice        Physics
Codeine    Chemistry
Molly        English
Kate         History
dtype: object

In [31]:
# To query by numeric position, use .iloc
s.iloc[2]
# The result is the third value from the original dict.


'English'

In [32]:
# To query by index label, use .loc
s.loc['Molly']


'English'

Pandas tries to make code more readable by offering a shorthand: you can use the indexing operator directly on the Series itself.

For instance, if you pass in an integer and the index holds non-integer labels (as it does here), the operator behaves as if you had queried via the **.iloc attribute**. If you pass in a label, it behaves like **.loc**.

In [33]:
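print(s.iloc[2])  # .iloc is an attribute, not a method, hence the square brackets
# The index holds strings, so pandas treats the integer 2 as a position
# (recent pandas versions deprecate this positional fallback)
print(s[2])
# Similarly with .loc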
print(s.loc['Kate'])
print(s['Kate'])


English
English
History
History


**Be careful with this shorthand: what it does depends on the type of the index, so use .loc and .iloc explicitly to avoid unexpected behaviour.**
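
As a minimal sketch of how the shorthand can surprise, consider a Series whose index holds integers. The `codes` Series and its labels below are made up for illustration and are not part of the examples above. With an integer index, the plain indexing operator looks up labels, not positions:

```python
import pandas as pd

# Hypothetical Series whose integer labels do not match the positions 0, 1, 2
codes = pd.Series(['a', 'b', 'c'], index=[10, 20, 30])

by_label = codes.loc[10]     # explicit label lookup
by_position = codes.iloc[0]  # explicit positional lookup
plain = codes[10]            # with an integer index, [] is a label lookup

# codes[0] is also a label lookup, and there is no label 0
try:
    codes[0]
    missing_label = False
except KeyError:
    missing_label = True

print(by_label, by_position, plain, missing_label)
```

Here `.loc` and `.iloc` state the intent explicitly, while the result of `codes[0]` depends on what the index happens to contain.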

---

In [34]:
# Now let's compute an average
# There are several ways to do it. One of them is to iterate over the values.
numbers = pd.Series(np.arange(0, 1000, 1))

total = 0
for value in numbers:
    total += value
print(total / len(numbers))


499.5


In [35]:
# Now use NumPy's vectorized sum, which avoids the Python-level loop
# and makes the code cleaner
np.sum(numbers) / len(numbers)


499.5
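
To compare the two approaches yourself, a rough sketch using the standard-library `timeit` module might look like the following. The helper names `loop_mean` and `vectorized_mean` are introduced here for illustration only, and the timings depend on your machine and pandas version, so none are quoted:

```python
import timeit

import numpy as np
import pandas as pd

numbers = pd.Series(np.arange(0, 1000, 1))


def loop_mean(series):
    """Average computed with an explicit Python loop."""
    total = 0
    for value in series:
        total += value
    return total / len(series)


def vectorized_mean(series):
    """Average computed by the built-in Series.mean method."""
    return series.mean()


# Run each approach 100 times and report the total elapsed time
loop_time = timeit.timeit(lambda: loop_mean(numbers), number=100)
vector_time = timeit.timeit(lambda: vectorized_mean(numbers), number=100)

print(loop_mean(numbers), vectorized_mean(numbers))
print(f"loop: {loop_time:.4f}s, vectorized: {vector_time:.4f}s")
```

Both functions return the same average; only the time taken differs, and the gap tends to grow with the size of the Series.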

In [37]:
# Here is another example
# Let's add 2 to every value in the numbers Series, one element at a time
# Series.items() yields (label, value) pairs
for label, value in numbers.items():
    numbers.at[label] = value + 2
numbers.head(5)


0    2
1    3
2    4
3    5
4    6
dtype: int64

In [39]:
# A Series can apply the same operation to every element at once (broadcasting)
numbers -= 2
numbers


0        0
1        1
2        2
3        3
4        4
      ... 
995    995
996    996
997    997
998    998
999    999
Length: 1000, dtype: int64
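
The in-place `-=` above modifies `numbers` directly. As a small sketch of the same idea with other operators, the snippet below shows that a plain arithmetic expression returns a new Series and that a comparison produces a boolean Series usable as a filter. The variable names `shifted`, `doubled`, and `is_large` are illustrative only:

```python
import numpy as np
import pandas as pd

numbers = pd.Series(np.arange(0, 1000, 1))

shifted = numbers + 2     # new Series; numbers itself is unchanged
doubled = numbers * 2     # element-wise multiplication
is_large = numbers > 997  # boolean Series, one True/False per element

print(shifted.head(3).tolist())
print(numbers.head(3).tolist())
print(numbers[is_large].tolist())
```

Writing `numbers + 2` instead of `numbers += 2` is a reasonable default when the original values are still needed later.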

---

In [41]:
# Finally, let's look at an example where index labels are not unique
students = {
    'Alice': 'Physics',
    'Codeine': 'Chemistry',
    'Molly': 'English'
}
s = pd.Series(students)
s


Alice        Physics
Codeine    Chemistry
Molly        English
dtype: object

In [43]:
kelly_classes = pd.Series(['Philosophy', 'Arts', 'Math'], index=['Kelly']*3)
kelly_classes


Kelly    Philosophy
Kelly          Arts
Kelly          Math
dtype: object

In [48]:
# Now combine Kelly's Series with the students' Series
all_students_classes = pd.concat([s, kelly_classes])
# pd.concat returns a new Series; the original Series is not modified
print(s)
print("\n")
print(all_students_classes)
print("\n")
# With .loc, a label that appears more than once returns a Series, not a single value
print(all_students_classes.loc["Kelly"])

Alice        Physics
Codeine    Chemistry
Molly        English
dtype: object


Alice         Physics
Codeine     Chemistry
Molly         English
Kelly      Philosophy
Kelly            Arts
Kelly            Math
dtype: object


Kelly    Philosophy
Kelly          Arts
Kelly          Math
dtype: object
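
Because the return type of `.loc` depends on whether a label is repeated, it can help to check the index before querying. The following is a minimal sketch with a shortened, made-up version of the data above (`all_classes` is a hypothetical name):

```python
import pandas as pd

# Hypothetical Series with one unique label and one repeated label
all_classes = pd.Series(
    ['Physics', 'Philosophy', 'Arts'],
    index=['Alice', 'Kelly', 'Kelly'],
)

# False as soon as any label appears more than once
unique_index = all_classes.index.is_unique

alice = all_classes.loc['Alice']  # unique label: a single value
kelly = all_classes.loc['Kelly']  # repeated label: a Series

# Passing a list of labels returns a Series even for a unique label
alice_as_series = all_classes.loc[['Alice']]

print(unique_index)
print(type(alice).__name__, type(kelly).__name__, type(alice_as_series).__name__)
```

Passing a list of labels is one way to get a consistent return type when the code that follows expects a Series.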
