In [1]:
!pip install multiset

Collecting multiset
  Downloading https://files.pythonhosted.org/packages/a8/12/813a649f5bc9801865dc6cda95b8f169f784d996322db192907ebe399064/multiset-2.1.1-py2.py3-none-any.whl
Installing collected packages: multiset
Successfully installed multiset-2.1.1
You are using pip version 9.0.3, however version 10.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.


In [5]:
pwd

'/home/jovyan/2018-summer/src/ipynb'

In [6]:
from multiset import Multiset
from IPython.display import display

#### What is the difference between these two syntaxes?
- `list((1,2,3,4))`
- `[1,2,3,4]`

In [8]:
assert list((1,2,3,4)) == [1,2,3,4]

YOUR ANSWER HERE

#### Write a list comprehension

In the cell below, write a function that takes a numerical list and uses a list comprehension to replace all of the even numbers with 0.

You will receive:

- five points for passing the base case, i.e. writing a function that does the above
- five points if you write error handling for the case where the function is handed a non-numeric list
   - to pass the test, if the function receives a list that contains non-numeric elements, simply return the string `"Can not transform list with non-numeric elements"`.

Your function should look like this:

    def change_evens_to_zeros(lst):
        # do some things
        return modified_lst

In [31]:
l = [1,2,3, "four"]
[type(i) for i in l]

[int, int, int, str]

In [39]:
set((1,1,1,1,1,1,1,1,1,2,3))

{1, 2, 3}

In [48]:
def change_evens_to_zeros(lst):

    # Collect the type of every element so non-numeric input can be rejected.
    list_of_types = [type(i) for i in lst]
    
    for typ in set(list_of_types):
        if typ not in [int, float, bool]:
            return "Can not transform list with non-numeric elements"
    
    # Replace even values (not values at even positions) with 0.
    new_list = [0 if v % 2 == 0 else v
                for v in lst]
        
    return new_list

In [53]:
change_evens_to_zeros([1,2,3,"four"])

'Can not transform list with non-numeric elements'

In [54]:
change_evens_to_zeros([1,2,3,4])

[1, 0, 3, 0]

In [50]:
assert change_evens_to_zeros(range(10)) == [0, 1, 0, 3, 0, 5, 0, 7, 0, 9]

In [51]:
assert change_evens_to_zeros([1,2,3,4,"five"]) == "Can not transform list with non-numeric elements"

In [52]:
some_list = [1, 2, True, 'foo', ("I'm", 'a', 'tuple')]

#### FREE RESPONSE

Why do you think it would be useful to have a compound data type that requires all elements to have the same primitive data type?

In [None]:
# your code here
raise NotImplementedError

# Lists

## Primitive Data Types 

Primitive data types are the most basic data types available in a programming language. The most common primitive data types in Python are:

In [55]:
display(type(1))
display(type(1.))
display(type(True))
display(type(None))
display(type(1+1.j))
display(type('None'))
display(type("No'n'e"))
display(type("""

multi-line string

"""))

int

float

bool

NoneType

complex

str

str

str

## Compound Data Types

A compound data type is a data type that is made up of one or more primitive data types.

Let's consider three compound data types:
- `set`
- `Multiset`
- `list`

The **length** of an instance of a compound data type is the number of elements in that instance.

The length of a `set` is the number of *unique* elements in that set.

In [56]:
display(len(set((1,1,2,3))))
display(len(set((1,2,3))))

3

3

The length of a `Multiset` is the number of elements in the `Multiset`.

In [57]:
display(len(Multiset((1,1,2,3))))
display(len(Multiset((1,2,3))))

4

3

The length of a `list` is the number of elements in the `list`.

In [58]:
display(len(list((1,1,2,3))))
display(len(list((1,2,3))))

4

3

#### So what is the difference between a `Multiset` and a `list`?

In addition to having a length, a list has an **order**. 

Neither a `set` nor a `Multiset` has an order.

In [59]:
set((1,1,2,3)) == set((1,2,1,3))

True

In [60]:
Multiset((1,1,2,3)) == Multiset((1,2,1,3))

True

In [61]:
list((1,1,2,3)) == list((1,2,1,3))

False

Two lists are considered equal if and only if they have the same elements in the same order:

In [62]:
list((1,2,3,4)) == [1,2,3,4]

True
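If order should not matter when comparing two lists, one option is to compare them as multisets instead. The following is a minimal sketch, assuming the standard-library `collections.Counter` as a stand-in for `Multiset`; the variable names are illustrative only.

```python
from collections import Counter

first = [1, 1, 2, 3]
second = [1, 2, 1, 3]

# As lists, order matters, so these are not equal.
lists_equal = first == second

# As multisets (element counts), order is ignored.
counts_equal = Counter(first) == Counter(second)

# As sets, both order and multiplicity are ignored.
sets_equal = set(first) == set([1, 2, 3])

print(lists_equal, counts_equal, sets_equal)  # False True True
```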

## Mathematical Definition of a List

**Definition**: *list*, *length*

A list of length $n$ is an ordered collection of $n$ items separated by commas and surrounded by brackets. 

We might be tempted to think of a list as a list of numbers:

In [63]:
a_list = [1,2,3,4]
a_list

[1, 2, 3, 4]

But in Python, a list is not restricted to numbers, and its elements need not even share the same primitive type:

In [64]:
another_list = ["one", 2., "III", 4]
another_list

['one', 2.0, 'III', 4]

Both of these are lists and have a length.

In [65]:
print("The length of a_list is %s." % len(a_list))
print("The length of another_list is %s." % len(another_list))

The length of a_list is 4.
The length of another_list is 4.


It is worth noting that Python lists are indexed from 0, so the first element is found using the index 0:

In [66]:
a_list[0]

1

In [67]:
another_list[2]

'III'
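Because indexing starts at 0, the last valid index of a list is one less than its length. A short sketch of the consequences, reusing `a_list` from above (the other names are illustrative):

```python
a_list = [1, 2, 3, 4]

first_item = a_list[0]                 # index 0 is the first element
last_item = a_list[len(a_list) - 1]    # the last valid index is len - 1
also_last = a_list[-1]                 # negative indices count from the end

try:
    a_list[len(a_list)]                # one past the end
    out_of_range = False
except IndexError:
    out_of_range = True

print(first_item, last_item, also_last, out_of_range)  # 1 4 4 True
```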

## List Comprehensions

In this next section, we will use a list comprehension as a tool in our programming. A list comprehension builds a new list by applying an expression to each element of a compound object. It takes this form:

    [do_something_to(placeholder_var) for placeholder_var in compound_object] 

For example, we might wish to convert the list `[1,2,3,4]` to the list `[2,3,4,5]`. This could be done using a list comprehension as follows:

    transformed_list = [v+1 for v in [1,2,3,4]]

We can think of list comprehensions as analogous to **vectorized operations**, in that we describe an operation on the entire list rather than on one element at a time. Under the hood, Python still evaluates the expression once per element.
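To make the correspondence concrete, here is a small sketch showing that the comprehension produces the same result as an explicit `for` loop. The names `source` and `looped` are illustrative and do not appear elsewhere in this notebook.

```python
source = [1, 2, 3, 4]

# Explicit loop: build the result one element at a time.
looped = []
for v in source:
    looped.append(v + 1)

# List comprehension: the same transformation in one expression.
transformed_list = [v + 1 for v in source]

print(looped)            # [2, 3, 4, 5]
print(transformed_list)  # [2, 3, 4, 5]
```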

### ADVANCED: `if` and `if-else` statements in list comprehensions

A trailing `if` clause can be used to filter values in a list comprehension.

    [v for v in compound_object if condition_is_true]

In [None]:
[v for v in [1,2,3,4] if v > 2]

Notice that this list comprehension returns only the values greater than 2.

We could even modify these values.

In [None]:
[v**2 for v in [1,2,3,4] if v > 2]

`if`-`else` syntax is a bit trickier, but can be used to conditionally modify values in a list comprehension. Note that the `if`-`else` expression comes before the `for`.

    [v if condition_is_true else other_val for v in compound_object]

In [None]:
[a if a > 2 else 0 for a in (1,2,3)]

In [None]:
[a**2 if a > 2 else a - 4 for a in [1,2,3,4]] 
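The two forms behave differently: a trailing `if` drops elements, while a leading `if`-`else` keeps every element and chooses what to put in its place. A brief sketch of the contrast, with illustrative variable names:

```python
values = [1, 2, 3, 4]

# Trailing `if` filters: elements failing the test are dropped.
filtered = [v for v in values if v > 2]

# Leading `if`-`else` maps: every element is kept, possibly replaced.
replaced = [v if v > 2 else 0 for v in values]

print(filtered)  # [3, 4]
print(replaced)  # [0, 0, 3, 4]
```

The filtered list can be shorter than the input, whereas the `if`-`else` form always returns a list of the same length.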

## Homogeneous Lists

Mathematically, we typically require that lists contain values all of the same primitive data type. What if we want to define a list where this is enforced?


Here, we define a new class `hlist` that inherits from the original `list` class. It initializes the `list` using the same method 

    list.__init__(self, *args)
    
but then checks the types of everything in the list

    types = set([type(v) for v in self])
    
by creating a set of the data types of the elements of the list. If all of the data types are the same, the `set` will have a length of 1! If the set of data types does not have a length of 1, then either the list holds more than one data type or the list is empty (a set of length 0). In either case, we `raise` the error `TypeError("All elements of the list must have the same type.")`.

In [89]:
class SomeClass(object):
    def __init__(self, arg1, arg2):
        self.arg1 = arg1
        self.arg2 = arg2
        
    def increment_arg_by(self, num):
        self.arg1 += num
        
    def cat_arg(self, string):
        self.arg2 += string

In [97]:
import sys

In [98]:
sys.version

'3.6.5 | packaged by conda-forge | (default, Apr  6 2018, 13:39:56) \n[GCC 4.8.2 20140120 (Red Hat 4.8.2-15)]'

In [96]:
my_instance_of_some_class = SomeClass(2)

TypeError: __init__() missing 1 required positional argument: 'arg2'

In [91]:
my_instance_of_some_class.arg1, my_instance_of_some_class.arg2

(2, 'foo')

In [95]:
my_instance_of_some_class.increment_arg_by(5)
my_instance_of_some_class.cat_arg()

TypeError: cat_arg() missing 1 required positional argument: 'string'

In [94]:
my_instance_of_some_class.arg1, my_instance_of_some_class.arg2

(12, 'foo Hello')

In [102]:
types = [int, int, int]
len(set(types))

1

In [104]:
class hlist(list):
    def __init__(self, *args):
        list.__init__(self, *args)

        types = set([type(v) for v in self])
        if len(types) != 1:
            raise TypeError("All elements of the list must have the same type.")

##### Let's test it out

In [105]:
try:
    this_list = hlist((1,2,3,4))
    print(this_list)
except TypeError as e:
    print(e)

[1, 2, 3, 4]


In [106]:
try:
    a_bad_list = hlist((1,2,"three"))
    print(a_bad_list)
except TypeError as e:
    print(e)

All elements of the list must have the same type.


In [107]:
try:
    a_string_list = hlist(("one", "II", "three"))
    print(a_string_list)
except TypeError as e:
    print(e)

['one', 'II', 'three']


Are mixed number types ok, though?

In [108]:
try:
    mixed_number_list = hlist((1., 2, 3))
    print(mixed_number_list)
except TypeError as foo:
    print(foo)

All elements of the list must have the same type.


Mixed number types cause the `hlist` to `raise` an error.

In [109]:
set([type(v) for v in (1.,1.,3)])

{float, int}

We can see that the set of data types includes two types, `int` and `float`.

Let's modify the homogeneous list class to allow for multiple primitive types *if* they are all numeric.

We will take the following as numeric types:

- `int`
- `float`
- `complex`

Here, we form a set of the data types in our list, but before we do this, we use an `if`-`else` list comprehension to change the label for all numeric types to the string `'numeric'`. Consider this:

In [None]:
some_list = [1.,2,4,1+0j,'foo']

In [None]:
types = [type(v) for v in some_list]
types

In [None]:
{'a':1, 'b':1}

In [None]:
class SomeClass():
    pass  # placeholder body so that the cell runs

In [None]:
def some_func(*args):
    return list(args)

```
for i in 
```


In [None]:
some_func('a',[1,2,3],{'a':1},{1,2,3})

In [None]:
some_func(1,2,3,'a','n')

In [None]:
from sklearn.tree import DecisionTreeClassifier

In [None]:
tree = DecisionTreeClassifier()

In [None]:
{int, float, complex} == set((int, float, complex))

In [None]:
numeric_types = {int, float, complex}

In [None]:
types = ['numeric' if t in numeric_types else t 
         for t in types]
types

In [None]:
set(types)
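Put together, the relabelling steps above can be sketched in one self-contained cell. The name `labels` is introduced here only for illustration, so that the original `types` list is left untouched.

```python
some_list = [1., 2, 4, 1 + 0j, 'foo']
numeric_types = {int, float, complex}

# Exact type of each element.
types = [type(v) for v in some_list]

# Collapse every numeric type into the single label 'numeric'.
labels = ['numeric' if t in numeric_types else t for t in types]

unique_labels = set(labels)
print(unique_labels == {'numeric', str})  # True
print(len(unique_labels))                 # 2
```

Because `'foo'` is a `str`, two labels remain, so this list would still be rejected as non-homogeneous.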

In [110]:
class hlist(list):
    def __init__(self, *args):
        list.__init__(self, *args)

        numeric_types = {int, float, complex}
        
        types = [type(v) for v in self]
        types = ['numeric' if t in numeric_types 
                       else t
                 for t in types]
        unique_types = set(types)
        
        if len(unique_types) != 1:
            raise TypeError("All elements of the list must have the same type.")

In [111]:
mixed_number_list = hlist((1., 2, 3))

In [112]:
mixed_number_list

[1.0, 2, 3]
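The set `{int, float, complex}` matches exact types only, so a `bool` such as `True` would not be labelled numeric by the class above. One possible alternative, which is not what `hlist` does, is to test against the abstract base class `numbers.Number` from the standard library. The helper `type_label` below is a hypothetical name used only for this sketch.

```python
import numbers


def type_label(value):
    """Return 'numeric' for any numeric value, else the value's exact type."""
    # Assumption: bool counts as numeric here, because bool subclasses int.
    if isinstance(value, numbers.Number):
        return 'numeric'
    return type(value)


labels_mixed = {type_label(v) for v in (1., 2, 1 + 0j, True)}
labels_bad = {type_label(v) for v in (1, 'two')}

print(labels_mixed)     # {'numeric'}
print(len(labels_bad))  # 2
```

Whether `bool` should count as numeric is a design choice; the exact-type set used in `hlist` is the stricter of the two options.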

In [113]:
import numpy as np 

In [118]:
arr = np.array((1,2,True))

In [121]:
type(arr)

numpy.ndarray

In [120]:
arr.astype(bool)

array([ True,  True,  True], dtype=bool)

In [None]:
np.array((1E-100, 0, 0+1j, False), dtype=bool)