# Overview of Collections - list and set

Let us get an overview of list and set as part of the Python Collections.

* Overview of list and set
* Common Operations
* Accessing Elements from list
* Adding Elements to list
* Updating and Deleting Elements - list
* Other list operations
* Adding and Deleting elements - set
* Typical set operations
* Validating set
* list and set - Usage

## Overview of list and set

There are 4 types of collections in Python. `list` and `set` typically hold homogeneous elements, while `dict` and `tuple` typically hold heterogeneous elements. This is a convention rather than a rule enforced by Python.
* Homogeneous means all elements are of the same type.
* Examples of collections with homogeneous elements.
  * Collection of employees - `list`
  * Collection of unique employees - `set`
  * Collection of integers - `list`
  * Collection of unique integers - `set`
* Based on the requirement, we should use the appropriate type of collection.
* `list`
  * Group of homogeneous elements.
  * There can be duplicates in the `list`.
  * `list` can be created by enclosing elements in `[]` - example `[1, 2, 3, 4]`.
  * Empty `list` can be initialized using `[]` or `list()`.
* `set`
  * Group of homogeneous elements.
  * No duplicates are allowed in a `set`. If you add the same element more than once, the repeated additions are ignored.
  * `set` can be created by enclosing elements in `{}` - example `{1, 2, 3, 4}`.
  * Empty `set` can be initialized using `set()`. We cannot initialize empty set using `{}` as it will be treated as empty `dict`.
* `list` and `set` are analogous to a table with rows and columns, while `dict` and `tuple` are analogous to a row within a table.
* `list` can hold duplicate values while `set` can only hold unique values.
* If you want a row with column names, use `dict`; otherwise use `tuple`.
* We will dive deep into all types of collections to get a better understanding of them.

In [None]:
l = [1, 2, 3, 3, 4, 4]

In [None]:
l

In [None]:
l = []

In [None]:
l

In [None]:
type(l)

In [None]:
l = list()

In [None]:
l

In [None]:
s = {1, 2, 3, 3, 4, 4}

In [None]:
s

In [None]:
type(s)

In [None]:
s = set() # Initializing empty set

In [None]:
s

In [None]:
s = {} # s will be of type dict

In [None]:
type(s)
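The table-and-row analogy above can be sketched as follows. This is only an illustration: the employee values and field names (`id`, `first_name`, `last_name`) are made up for this example and do not come from any dataset used in this notebook.

```python
# A "table" is a list whose elements are rows.
# Each row can be a tuple (fields identified by position) ...
employees_as_tuples = [
    (1, 'Scott', 'Tiger'),
    (2, 'Donald', 'Duck'),
]

# ... or a dict (fields identified by column name).
employees_as_dicts = [
    {'id': 1, 'first_name': 'Scott', 'last_name': 'Tiger'},
    {'id': 2, 'first_name': 'Donald', 'last_name': 'Duck'},
]

# Reading the first name from the first row in both representations
first_name_from_tuple = employees_as_tuples[0][1]
first_name_from_dict = employees_as_dicts[0]['first_name']
print(first_name_from_tuple, first_name_from_dict)
```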

## Common Operations

There are some functions which can be applied on all collections. Here we will see details related to `list` and `set`.
* `in` - check if element exists
* `len` - to get the number of elements.
* `sorted` - to sort the data (original collection will be untouched). Typically, we assign the result of sorting to a new collection.
* `sum`, `min`, `max`, etc. - aggregate functions over the elements.
* There can be more such functions.

In [None]:
l = [1, 2, 3, 4] # list

In [None]:
1 in l

In [None]:
5 in l

In [None]:
len(l)

In [None]:
sorted(l)

In [None]:
sum(l)

In [None]:
s = {1, 2, 3, 4} # set

In [None]:
1 in s

In [None]:
5 in s

In [None]:
len(s)

In [None]:
sorted(s)

In [None]:
sum(s)
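As a small supplementary sketch, the cell below shows that `sorted` returns a new `list` and leaves the original collection untouched, and that `min` and `max` work the same way as `sum`. The sample values are arbitrary.

```python
l = [3, 1, 4, 2]

sorted_l = sorted(l)  # a new list with the elements in ascending order
print(sorted_l)       # [1, 2, 3, 4]
print(l)              # [3, 1, 4, 2] (the original list is unchanged)

print(min(l), max(l), sum(l))  # 1 4 10

s = {3, 1, 4, 2}
sorted_s = sorted(s)  # sorted always returns a list, even for a set
print(type(sorted_s))
```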

## Accessing Elements from list

Let us see how we can access elements from the `list`.
* We can access a particular element in a `list` by using index `l[index]`. Index starts with 0.
* We can also get a range of elements using a slice `l[start:stop]`. It returns the elements from index `start` up to, but not including, index `stop`.
* Index can be negative and it will provide elements from the end. We can get last n elements by using `l[-n:]`.
* Let us see a few examples.

In [None]:
l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

In [None]:
l[0] # getting first element

In [None]:
l[2:4] # get 3rd and 4th elements (index 2 up to, but not including, index 4)

In [None]:
l[-1] # get last element

In [None]:
l[-4:] # get last 4 elements

In [None]:
l[-5:-2] # get elements from 6th to 8th
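The slicing behaviour can be checked with a short sketch. The variable names `start` and `stop` are used here only for illustration.

```python
l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

start, stop = 2, 4
part = l[start:stop]  # elements at index 2 and 3; index 4 is excluded
print(part)           # [3, 4]
print(len(part) == stop - start)  # True

first_three = l[:3]   # omitting start means "from the beginning"
from_eighth = l[7:]   # omitting stop means "until the end"
print(first_three)    # [1, 2, 3]
print(from_eighth)    # [8, 9, 10]
```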

## Adding Elements to list

We can perform below operations to add elements to the list.
* `append` - to add a single element at the end of the list.
* `insert` - to insert an element at the index specified. All the elements from that index onwards will be moved to the right.
* `extend` - to extend the list by appending all the elements from another list.
* We can also concatenate lists using `+`, which creates a new list.

In [None]:
l = [1, 2, 3, 4]

In [None]:
l.append?

In [None]:
l.append(5)

In [None]:
l

In [None]:
l = l + [6]

In [None]:
l

In [None]:
l = l + [7, 8, 9, 10]

In [None]:
l

In [None]:
l.insert?

In [None]:
l.insert(3, 13)

In [None]:
l

In [None]:
l.extend?

In [None]:
l.extend([11, 12])

In [None]:
l
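The difference between `append`, `extend`, and `+` is easiest to see when the argument is itself a list. The sketch below uses throwaway variable names (`a`, `b`, `c`, `d`).

```python
a = [1, 2, 3]
a.append([4, 5])  # the whole list is added as ONE element
print(a)          # [1, 2, 3, [4, 5]]

b = [1, 2, 3]
b.extend([4, 5])  # each element of the argument is added
print(b)          # [1, 2, 3, 4, 5]

c = [1, 2, 3]
d = c + [4, 5]    # + builds a new list and leaves c untouched
print(c)          # [1, 2, 3]
print(d)          # [1, 2, 3, 4, 5]
```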

## Updating and Deleting Elements - list

Here is how we can update elements in the list as well as delete elements from the list.
* We can assign an element to the list using index to update.
* There are multiple methods to delete elements from a list.
  * `remove` - delete the first occurrence of the element from the list.
  * `pop` - delete and return the element at the given index (the last element if no index is passed).
  * `clear` - delete all the elements from the list.

In [None]:
l = [1, 2, 3, 4]

In [None]:
l[1] = 11

In [None]:
l

In [None]:
l = [1, 2, 3, 4, 4, 6]

In [None]:
l.remove?

In [None]:
l.remove(4)

In [None]:
l

In [None]:
l.pop?

In [None]:
l.pop()

In [None]:
l

In [None]:
l.pop(2)

In [None]:
l

In [None]:
l.clear()

In [None]:
l
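Two details are worth a quick sketch: `pop` returns the element it deletes, and `remove` raises `ValueError` when the element is not in the list. The variable names below are only illustrative.

```python
l = [1, 2, 3, 4, 4, 6]

last = l.pop()    # removes and returns the last element
print(last, l)    # 6 [1, 2, 3, 4, 4]

third = l.pop(2)  # removes and returns the element at index 2
print(third, l)   # 3 [1, 2, 4, 4]

l.remove(4)       # removes only the first occurrence of 4
print(l)          # [1, 2, 4]

try:
    l.remove(99)  # 99 is not in the list
    removed_missing = True
except ValueError:
    removed_missing = False
print(removed_missing)  # False
```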

## Other list operations

Here are some of the other list operations we frequently use.
* `count` - number of times an element is present in a list.
* `sort` - to sort the data within the list. Data in the list will be sorted in-place.
* `reverse` - to reverse the order of the elements in the list. The list is reversed in-place.

In [None]:
s = 'asdfasfsafljojlsdfaljfasf'

In [None]:
l = list(s)

In [None]:
l.count?

In [None]:
l.count('a')

In [None]:
l.count('z')

In [None]:
l.sort?

In [None]:
l.sort()

In [None]:
l

In [None]:
l.reverse?

In [None]:
l.reverse()

In [None]:
l
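Because `sort` works in-place, it returns `None` rather than the sorted list. The sketch below, using an arbitrary sample string, shows this along with the `reverse=True` argument for descending order.

```python
l = list('banana')

result = l.sort()     # sorts l itself; the return value is None
print(result)         # None
print(l)              # ['a', 'a', 'a', 'b', 'n', 'n']

l.sort(reverse=True)  # sort in descending order, again in-place
print(l)              # ['n', 'n', 'b', 'a', 'a', 'a']

print(l.count('a'))   # 3
```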

## Adding and Deleting elements - set

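Let us see how we can add elements to the set and delete elements from it.
* We can add elements to a `set` or combine it with other collections.
  * `add` - add a single element to the set.
  * `update` - add all the elements from another collection to the set in-place.
  * `union` - return a new set with the elements from both sets. The original set is not modified.
* We can delete elements from the `set` using different methods.
  * `pop` - remove and return an arbitrary element.
  * `remove` - remove the given element. Raises `KeyError` if the element does not exist.
  * `discard` - remove the given element if it exists. No error is raised otherwise.
  * `clear` - remove all the elements from the set.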

In [None]:
s = {1, 2, 3, 3, 3, 4, 4}

In [None]:
s.add(5)

In [None]:
s

In [None]:
s.update?

In [None]:
s.update({4, 5, 6, 7}) # Updates the set on which update is invoked

In [None]:
s

In [None]:
s.union?

In [None]:
s = {1, 2, 3, 4, 5}
s.union({4, 5, 6, 7}) # Creates new set

In [None]:
s

In [None]:
s.pop?

In [None]:
s.pop()

In [None]:
s

In [None]:
s.remove?

In [None]:
s.remove(4)

In [None]:
s

In [None]:
s.remove(6) # 6 does not exist, throws KeyError

In [None]:
s = {1, 2, 3, 4, 5}

In [None]:
s.discard?

In [None]:
s.discard(4)

In [None]:
s

In [None]:
s.discard(6)

In [None]:
s

In [None]:
s.clear?

In [None]:
s.clear()

In [None]:
s
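The two contrasts in this section, `union` versus `update` and `remove` versus `discard`, can be condensed into one sketch. The values are arbitrary.

```python
s = {1, 2, 3}

combined = s.union({3, 4})  # new set; s itself is not modified
print(sorted(combined))     # [1, 2, 3, 4]
print(sorted(s))            # [1, 2, 3]

s.update({3, 4})            # modifies s in-place
print(sorted(s))            # [1, 2, 3, 4]

s.discard(10)               # 10 is not present; nothing happens

try:
    s.remove(10)            # 10 is not present; raises KeyError
    raised_key_error = False
except KeyError:
    raised_key_error = True
print(raised_key_error)     # True
```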

## Typical set operations

We typically perform the below operations on sets. These are typical mathematical set operations.
* `union` - get all unique elements from 2 or more sets.
* `intersection` - get common elements between 2 or more sets.
* `difference` - get elements that are in one set but not in the other set.

All the above functions generate a new set.

In [None]:
s1 = {1, 2, 3, 4}

In [None]:
s2 = {3, 4, 5, 6, 7}

In [None]:
s1.union(s2)

In [None]:
s1.intersection(s2)

In [None]:
s1.difference(s2)

In [None]:
s2.difference(s1)
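Python also provides the operators `|`, `&`, and `-` as shorthand for these methods when both operands are sets. The sketch below shows the equivalence and passes more than two sets to `union` and `intersection`; the third set `s3` is an arbitrary addition for illustration.

```python
s1 = {1, 2, 3, 4}
s2 = {3, 4, 5, 6, 7}
s3 = {4, 8}

print(s1 | s2 == s1.union(s2))         # True
print(s1 & s2 == s1.intersection(s2))  # True
print(s1 - s2 == s1.difference(s2))    # True

all_elements = s1.union(s2, s3)            # union of three sets
common_to_all = s1.intersection(s2, s3)    # elements present in all three
print(sorted(all_elements))   # [1, 2, 3, 4, 5, 6, 7, 8]
print(sorted(common_to_all))  # [4]
```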

## Validating set

Here are some of the operations that can be performed to validate sets.

* Checking if an element exists (using the `in` operator).
* `issubset` - check if the first set is a subset of the second set.
* `issuperset` - check if the first set is a superset of the second set.
* `isdisjoint` - check if 2 sets have no common elements. It returns `True` when nothing is shared.


In [None]:
s = {1, 2, 3, 3, 4, 4, 4, 5}

In [None]:
1 in s

In [None]:
s1 = {1, 2, 3}

In [None]:
s2 = {1, 2, 3, 4, 5}

In [None]:
s1.issubset?

In [None]:
s1.issubset(s2)

In [None]:
s1.issuperset(s2)

In [None]:
s2.issuperset(s1)

In [None]:
s1.isdisjoint?

In [None]:
s1 = {1, 2, 3, 4}

In [None]:
s2 = {3, 4, 5, 6, 7}

In [None]:
s1.isdisjoint(s2)

In [None]:
s1 = {1, 2, 3, 4}

In [None]:
s2 = {5, 6, 7}

In [None]:
s1.isdisjoint(s2)
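One place these checks may be useful is validating that a collection contains everything we expect. The sketch below uses hypothetical column names; none of them are defined elsewhere in this notebook.

```python
# Hypothetical column names, used only for illustration
required_columns = {'order_id', 'order_date'}
available_columns = {'order_id', 'order_date', 'order_customer_id', 'order_status'}
reserved_names = {'id', 'date'}

# Every required column must be available
has_all_required = required_columns.issubset(available_columns)
print(has_all_required)  # True

# The <= operator performs the same subset check between two sets
print(required_columns <= available_columns)  # True

# No available column should clash with a reserved name
no_clash = available_columns.isdisjoint(reserved_names)
print(no_clash)  # True
```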

## list and set - Usage

Let us see some real world usage of list and set.
* list is used more often than set.
  * Reading data from file into a list
  * Reading data from a table into a list
* We can convert a list to set to perform these operations.
  * Get unique elements from the list
  * Perform set operations between 2 lists such as union, intersection, difference etc.
* We can convert a set to list to perform these operations.
  * Reverse the collection
  * Append multiple collections to create new collections while retaining duplicates
* You will see some of these in action as we get into other related topics down the line

In [None]:
# Reading data from file into a list
path = '/Users/itversity/Research/data/retail_db/orders/part-00000'
# C:\\users\\itversity\\Research
orders_file = open(path)

In [None]:
orders_raw = orders_file.read()
orders_file.close() # release the file handle once the data is read

In [None]:
orders = orders_raw.splitlines()

In [None]:
orders[:10]

In [None]:
len(orders) # same as number of records in the file

In [None]:
# Get unique dates
dates = ['2013-07-25 00:00:00.0', '2013-07-25 00:00:00.0', '2013-07-26 00:00:00.0', '2014-01-25 00:00:00.0']

In [None]:
dates

In [None]:
len(dates)

In [None]:
set(dates)

In [None]:
len(set(dates)) # number of unique dates

In [None]:
# Creating new collection retaining duplicates using 2 sets
s1 = {'2013-07-25 00:00:00.0', '2013-07-26 00:00:00.0', '2014-01-25 00:00:00.0'}

In [None]:
s2 = {'2013-08-25 00:00:00.0', '2013-08-26 00:00:00.0', '2014-01-25 00:00:00.0'}

In [None]:
s1.union(s2)

In [None]:
len(s1.union(s2))

In [None]:
s = list(s1) + list(s2)

In [None]:
s

In [None]:
len(s)
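Set operations between 2 lists, mentioned in the list above, can be sketched by converting the lists to sets first. The two lists below reuse date strings in the same format as the earlier cells; the variable names are illustrative.

```python
dates_a = ['2013-07-25 00:00:00.0', '2013-07-25 00:00:00.0', '2013-07-26 00:00:00.0']
dates_b = ['2013-07-26 00:00:00.0', '2014-01-25 00:00:00.0']

# Unique dates from the first list, sorted back into a list
unique_dates_a = sorted(set(dates_a))
print(unique_dates_a)

# Dates present in both lists (intersection accepts any iterable)
common_dates = set(dates_a).intersection(dates_b)
print(common_dates)

# Dates present only in the first list
only_in_a = set(dates_a) - set(dates_b)
print(only_in_a)
```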