# Series

In [1]:
import pandas as pd

## Create a Series Object from a List
- A pandas **Series** is a one-dimensional labelled array.
- A Series combines the best features of a list and a dictionary.
- A Series maintains a single collection of ordered values (i.e. a single column of data).
- We can assign each value an identifier, which does not have to *be* unique.

In [3]:
ice_cream= ['chocolate', 'vanilla', 'strawberry', 'rum raisin']
pd.Series(ice_cream)

0     chocolate
1       vanilla
2    strawberry
3    rum raisin
dtype: object

- The final *dtype* row reports the type of the data stored in the Series. Although this Series holds only strings, pandas reports *object*, the dtype it uses in this output for strings and for more complex or mixed data.
- The *object* dtype applies only to strings and mixed data. Other data types, such as integers and booleans, get their own dtypes (`int64` and `bool` in the examples that follow).

In [4]:
lottery_numbers= [10,64,44,32,98]
pd.Series(lottery_numbers)

0    10
1    64
2    44
3    32
4    98
dtype: int64

In [6]:
registrations= [True, False, False, True, True]
pd.Series(registrations)

0     True
1    False
2    False
3     True
4     True
dtype: bool

- A Series associates every value with an identifier (an index label). When we don't specify labels, pandas assigns integers starting from 0, just like the positions in a list.

## Create a Series Object from a Dictionary

In [9]:
sushi= {
    'Salmon': 'Orange',
    'Tuna': 'Red',
    'Eel': 'Brown'
}

pd.Series(sushi)

Salmon    Orange
Tuna         Red
Eel        Brown
dtype: object

- A Series combines features of lists and dictionaries: like a list, it keeps its values in order; like a dictionary, it associates each key with a value. In this case the Series knows both at once: the key-value associations and the order of the stored values.

## Intro to Series Methods
- The syntax to invoke a method on any object is `object.method()`.
- The `sum` method adds together the **Series'** values.
- The `product` method multiplies the **Series'** values.
- The `mean` method finds the average of the **Series'** values.
- The `std` method finds the standard deviation of the **Series'** values.

In [10]:
prices= pd.Series([2.44, 4.00, 8.92])
prices

0    2.44
1    4.00
2    8.92
dtype: float64

In [11]:
prices.sum()

15.36

In [14]:
prices.product()

87.0592

In [12]:
prices.mean()

5.12

In [13]:
prices.std()

3.3820703718284753

## Intro to Attributes
- An **attribute** is a piece of data that lives on an object.
- An **attribute** is a fact, a detail, a characteristic of the object.
- Access an attribute with `object.attribute` syntax.
- The `size` attribute returns a count of the number of values in the **Series**.
- The `is_unique` attribute returns True if the **Series** has no duplicate values.
- The `values` and `index` attributes return the underlying objects that hold the **Series'** values and index labels.

In [15]:
adjectives= pd.Series(['Smart', 'Handsome', 'Charming', 'Brilliant', 'Humble', 'Smart'])
adjectives

0        Smart
1     Handsome
2     Charming
3    Brilliant
4       Humble
5        Smart
dtype: object

In [16]:
adjectives.size

6

In [17]:
adjectives.is_unique

False

In [19]:
adjectives.values

array(['Smart', 'Handsome', 'Charming', 'Brilliant', 'Humble', 'Smart'],
      dtype=object)

In [21]:
adjectives.index

RangeIndex(start=0, stop=6, step=1)

In [24]:
type(adjectives.values)  # a NumPy array; NumPy is a separate library that pandas builds on

numpy.ndarray

In [25]:
type(adjectives.index)

pandas.core.indexes.range.RangeIndex

## Parameters and Arguments
- A **parameter** is the name for an expected input to a function/method/class instantiation.
- An **argument** is the concrete value we provide for a parameter during invocation.
- We can pass arguments either sequentially (based on parameter order) or with explicit parameter names written out.
- The first two parameters for the **Series** constructor are `data` and `index`, which represent the values and the index labels.

In [26]:
fruits= ['Apple', 'Watermelon', 'Orange', 'Grapefruit', 'Pineapple', 'Strawberry']
weekdays= ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']

In [30]:
pd.Series(fruits)
pd.Series(weekdays)
pd.Series(fruits, weekdays)
pd.Series(weekdays, fruits)

Apple            Monday
Watermelon      Tuesday
Orange        Wednesday
Grapefruit     Thursday
Pineapple        Friday
Strawberry     Saturday
dtype: object

In [32]:
pd.Series(data= fruits, index= weekdays)
pd.Series(index= weekdays, data= fruits)

Monday            Apple
Tuesday      Watermelon
Wednesday        Orange
Thursday     Grapefruit
Friday        Pineapple
Saturday     Strawberry
dtype: object

In [33]:
pd.Series(fruits, index= weekdays)

Monday            Apple
Tuesday      Watermelon
Wednesday        Orange
Thursday     Grapefruit
Friday        Pineapple
Saturday     Strawberry
dtype: object

In [34]:
pd.Series(fruits)

0         Apple
1    Watermelon
2        Orange
3    Grapefruit
4     Pineapple
5    Strawberry
dtype: object

In [35]:
pd.Series()

Series([], dtype: object)

## Import Series with the pd.read_csv Function
- A **CSV** is a plain text file that uses line breaks to separate rows and commas to separate row values.
- Pandas ships with many different `read_` functions for different types of files.
- The `read_csv` function accepts many different parameters. The first one specifies the file name/path.
- The `read_csv` function will import the dataset as a **DataFrame**, a 2-dimensional table.
- The `usecols` parameter accepts a list of the column(s) to import.
- The `squeeze` method converts a single-column **DataFrame** to a **Series**.

In [37]:
df= pd.read_csv('google_stock_price.csv')
df.head()

| Unnamed: 0 | Date | Price |
|---|---|---|
| 0 | 2004-08-19 | 2.490664 |
| 1 | 2004-08-20 | 2.51582 |
| 2 | 2004-08-23 | 2.758411 |
| 3 | 2004-08-24 | 2.770615 |
| 4 | 2004-08-25 | 2.614201 |
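
As a minimal sketch of the `usecols` and `squeeze` bullets above, the following reads a small in-memory CSV instead of `google_stock_price.csv`, so it runs without the file. The `csv_text` contents and the variable names are illustrative assumptions; only the `Date` and `Price` column names mirror the output shown above.

```python
import io

import pandas as pd

# Stand-in for a CSV file on disk (hypothetical two-row sample).
csv_text = "Date,Price\n2004-08-19,2.490664\n2004-08-20,2.51582\n"

# usecols limits the import to the listed column(s); the result is still a DataFrame.
price_df = pd.read_csv(io.StringIO(csv_text), usecols=["Price"])

# squeeze("columns") turns a single-column DataFrame into a Series.
price_series = price_df.squeeze("columns")

print(type(price_df).__name__)      # DataFrame
print(type(price_series).__name__)  # Series
```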


## The head and tail Methods
- The `head` method returns a number of rows from the top/beginning of the `Series`.
- The `tail` method returns a number of rows from the bottom/end of the `Series`.
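
A short sketch of both methods, using made-up values rather than the stock price file (the `daily_prices` name and its numbers are assumptions for illustration):

```python
import pandas as pd

# Hypothetical data: seven values so head and tail return different rows.
daily_prices = pd.Series([2.49, 2.52, 2.76, 2.77, 2.61, 2.69, 2.71])

first_rows = daily_prices.head()   # no argument: the first 5 rows
last_rows = daily_prices.tail(2)   # pass a number to change the row count

print(first_rows)
print(last_rows)
```

Both methods return a new **Series** that keeps the original index labels.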

## Passing Series to Python's Built-In Functions
- The `len` function returns the length of the **Series**.
- The `type` function returns the type of an object.
- The `list` function converts the **Series** to a list.
- The `dict` function converts the **Series** to a dictionary.
- The `sorted` function converts the **Series** to a sorted list.
- The `max` function returns the largest value in the **Series**.
- The `min` function returns the smallest value in the **Series**.
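
The functions listed above can be tried on a small example. The `scores` Series and its labels are hypothetical:

```python
import pandas as pd

# Hypothetical Series with string labels and integer values.
scores = pd.Series([3, 1, 2], index=["a", "b", "c"])

length = len(scores)         # number of values
kind = type(scores)          # the pandas Series class
as_list = list(scores)       # values only, in their original order
as_dict = dict(scores)       # index labels become keys
in_order = sorted(scores)    # values in ascending order, as a list
largest = max(scores)
smallest = min(scores)

print(length, as_list, as_dict, in_order, largest, smallest)
```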

## Check for Inclusion with Python's in Keyword
- The `in` keyword checks if a value exists within an object.
- The `in` keyword will look for a value in the **Series's** index.
- Use the `index` and `values` attributes to access "nested" objects within the **Series**.
- Combine the `in` keyword with `values` to search within the **Series's** values.
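
The difference between searching the index and searching the values is easy to miss, so here is a small sketch. The `stock` Series is a hypothetical example:

```python
import pandas as pd

# Hypothetical Series: string labels, integer values.
stock = pd.Series([10, 20], index=["apples", "pears"])

label_found = "apples" in stock          # True: `in` checks the index labels
value_in_series = 10 in stock            # False: 10 is a value, not a label
value_found = 10 in stock.values         # True: search the values explicitly
label_in_index = "pears" in stock.index  # True: same as the default behaviour

print(label_found, value_in_series, value_found, label_in_index)
```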

## The sort_values Method
- The `sort_values` method sorts a **Series'** values in order.
- By default, pandas applies an ascending sort (smallest to largest).
- Customize the sort order with the `ascending` parameter.
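
A minimal sketch of both sort directions, using made-up numbers:

```python
import pandas as pd

# Hypothetical unsorted values.
numbers = pd.Series([32, 10, 98, 44])

smallest_first = numbers.sort_values()                # ascending is the default
largest_first = numbers.sort_values(ascending=False)  # reverse the order

print(smallest_first)
print(largest_first)
```

Each value keeps its original index label, so the labels appear out of order after the sort.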

## The sort_index Method
- The `sort_index` method sorts a **Series** by its index.
- The `sort_index` method also accepts an `ascending` parameter to set sort order.
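
A comparable sketch for sorting by label. The `letters` Series is a hypothetical example:

```python
import pandas as pd

# Hypothetical Series whose labels are out of alphabetical order.
letters = pd.Series([1, 2, 3], index=["c", "a", "b"])

a_to_z = letters.sort_index()                 # labels in ascending order
z_to_a = letters.sort_index(ascending=False)  # labels in descending order

print(a_to_z)
print(z_to_a)
```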

## Extract Series Value by Index Position
- Use the `iloc` accessor to extract a **Series** value by its index position.
- `iloc` is short for "integer location".
- Python's list slicing syntaxes (slices, slices from start, slices to end, etc.) are supported with **Series** objects.
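
The following sketch shows position-based access on a Series with string labels. The data reuses a few of the fruit and weekday names from earlier; the `menu` variable name is an assumption:

```python
import pandas as pd

menu = pd.Series(
    ["Apple", "Watermelon", "Orange", "Grapefruit"],
    index=["Monday", "Tuesday", "Wednesday", "Thursday"],
)

first = menu.iloc[0]         # single position
last = menu.iloc[-1]         # negative positions count from the end
middle = menu.iloc[1:3]      # slice: positions 1 and 2 (end is exclusive)
picked = menu.iloc[[0, 2]]   # list of positions

print(first, last)
print(middle)
print(picked)
```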

## Extract Series Value by Index Label
- Use the `loc` accessor to extract a **Series** value by its index label.
- Pass a list to extract multiple values by index label.
- If any label in the list does not exist, pandas raises a `KeyError`; an out-of-range position passed to `iloc` raises an `IndexError`.
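
A brief sketch of label-based access, including what happens when a label is missing. The `menu` Series is an illustrative assumption built from the earlier fruit and weekday names:

```python
import pandas as pd

menu = pd.Series(
    ["Apple", "Watermelon", "Orange"],
    index=["Monday", "Tuesday", "Wednesday"],
)

one = menu.loc["Monday"]                     # single label
several = menu.loc[["Monday", "Wednesday"]]  # list of labels

# "Sunday" is not in the index, so the lookup fails.
try:
    menu.loc[["Monday", "Sunday"]]
    missing_label_raised = False
except KeyError:
    missing_label_raised = True

print(one)
print(several)
print(missing_label_raised)
```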

## The get Method on a Series
- The `get` method extracts a **Series** value by index label. It is an alternative option to square brackets.
- The `get` method's second argument sets the fallback value to return if the label/position does not exist.
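
To illustrate the fallback behaviour, here is a small sketch with a hypothetical `menu` Series:

```python
import pandas as pd

menu = pd.Series(["Apple", "Watermelon"], index=["Monday", "Tuesday"])

found = menu.get("Monday")                   # label exists
missing = menu.get("Sunday")                 # no fallback given: returns None
fallback = menu.get("Sunday", "Not on menu") # explicit fallback value

print(found, missing, fallback)
```

Unlike square brackets or `loc`, `get` does not raise an error for a missing label.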

## Overwrite a Series Value
- Use the `loc/iloc` accessor to target an index label/position, then use an equal sign to provide a new value.
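
A minimal sketch of both forms of assignment, using made-up labels and numbers:

```python
import pandas as pd

# Hypothetical Series of floats with string labels.
amounts = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])

amounts.loc["a"] = 10.0   # overwrite by index label
amounts.iloc[2] = 30.0    # overwrite by index position

print(amounts)
```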

## The copy Method
- A **copy** is a duplicate/replica of an object.
- Changes to a copy do not modify the original object.
- A **view** is a different way of looking at the *same* data.
- Changes to a view *do* modify the original object.
- The `copy` method creates a copy of a pandas object.
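
The sketch below demonstrates only the copy side of the distinction, with made-up values. Whether a given operation returns a view depends on the operation and the pandas version, so that case is left out here.

```python
import pandas as pd

original = pd.Series([1, 2, 3])

# copy() duplicates the data, so later edits do not touch the original.
duplicate = original.copy()
duplicate.iloc[0] = 99

print(original.iloc[0])   # 1
print(duplicate.iloc[0])  # 99
```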

## Math Methods on Series Objects
- The `count` method returns the number of values in the **Series**. It excludes missing values; the `size` attribute includes missing values.
- The `sum` method adds together the **Series's** values.
- The `product` method multiplies together the **Series's** values.
- The `mean` method calculates the average of the **Series's** values.
- The `std` method calculates the standard deviation of the **Series's** values.
- The `max` method returns the largest value in the **Series**.
- The `min` method returns the smallest value in the **Series**.
- The `median` method returns the median of the **Series** (the value in the middle).
- The `mode` method returns the mode of the **Series** (the most frequent value). It returns a **Series**, because several values can tie for most frequent.
- The `describe` method returns a summary with various mathematical calculations.
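
To make the difference between `count` and `size` concrete, the following sketch uses a hypothetical Series with one missing value:

```python
import pandas as pd

# Hypothetical data; None becomes a missing value (NaN).
readings = pd.Series([1.0, 2.0, 2.0, None, 5.0])

total_slots = readings.size     # 5: includes the missing value
present = readings.count()      # 4: excludes the missing value
total = readings.sum()          # 10.0: missing values are skipped
average = readings.mean()       # 2.5
middle = readings.median()      # 2.0
most_common = readings.mode()   # a Series containing 2.0
summary = readings.describe()   # count, mean, std, min, quartiles, max

print(total_slots, present, total, average, middle)
print(most_common)
print(summary)
```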

## Broadcasting
- **Broadcasting** describes the process of applying an arithmetic operation to every element of an array (i.e., a **Series**).
- We can combine mathematical operators with a **Series** to apply the mathematical operation to every value.
- There are also methods to accomplish the same results (`add`, `sub`, `mul`, `div`, etc.)
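
A small sketch showing that the operators and the named methods give the same result. The values are made up:

```python
import pandas as pd

counts = pd.Series([1, 2, 3])

plus_ten = counts + 10        # 11, 12, 13
plus_ten_m = counts.add(10)   # same result via the method
doubled = counts * 2          # 2, 4, 6
doubled_m = counts.mul(2)     # same result via the method

print(plus_ten)
print(doubled)
```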

## The value_counts Method
- The `value_counts` method returns the number of times each unique value occurs in the **Series**.
- The `normalize` parameter returns the relative frequencies/percentages of the values instead of the counts.
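
A short sketch of counts versus relative frequencies, using a few of the adjectives from earlier (the `traits` name is an assumption):

```python
import pandas as pd

traits = pd.Series(["Smart", "Handsome", "Smart", "Humble"])

counts = traits.value_counts()                # "Smart" appears twice
shares = traits.value_counts(normalize=True)  # fractions that sum to 1

print(counts)
print(shares)
```

The result is itself a **Series**: the unique values become the index labels and the counts become the values.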

## The apply Method
- The `apply` method accepts a function. It invokes that function on every `Series` value.
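
As an illustration, `apply` works with both built-in and custom functions. The `shout` helper below is hypothetical:

```python
import pandas as pd

flavors = pd.Series(["chocolate", "vanilla", "rum raisin"])


def shout(text):
    """Hypothetical helper: upper-case a single string."""
    return text.upper()


lengths = flavors.apply(len)   # built-in function, called once per value
loud = flavors.apply(shout)    # custom function, called once per value

print(lengths)
print(loud)
```

Note that the function is passed by name, without parentheses, so that pandas can invoke it for each value.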

## The map Method
- The `map` method "maps" or connects each **Series** value to another value.
- We can pass the method a dictionary or a **Series**. Both types connect keys to values.
- The `map` method uses our argument to connect or bridge together the values.
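
A minimal sketch of mapping with a dictionary and with a **Series**. The `pets` and `sounds` data are made up for illustration:

```python
import pandas as pd

pets = pd.Series(["cat", "dog", "cat", "fish"])
sounds = {"cat": "meow", "dog": "woof"}

# Each value in pets is looked up as a key in the dictionary.
from_dict = pets.map(sounds)

# A Series works the same way: its index labels act as the keys.
from_series = pets.map(pd.Series(sounds))

print(from_dict)
print(from_series)
```

Values with no matching key ("fish" here) become missing values in the result.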