# M07 Notes

# M05 Quiz

![image.png](attachment:66b151c2-85b9-4aab-a2c9-26f2bf361c9d.png)

* Shape is represented here without parentheses, so there is no need to use the comma on a single value.
* The answer is clearly not two values. This is the salient point. Even if this were a syntactic error, the logical answer would not be two values.

# NumPy

- **Structured arrays** can actually store mixed data types.
    - Apparently, this has been a feature of NumPy since version 1.0, released in 2006. 
    - It's odd, then, that NumPy arrays are often introduced as requiring a single data type. 🤔 
- Use `df.to_numpy()` to convert Pandas dataframes to NumPy data structures.

Here is an example.

We create a list of tuples of mixed data, one tuple per row of data.

In [1]:
import numpy as np

In [2]:
data = [
    ('Alice', 25, 55.0), 
    ('Bob', 32, 60.5),
    ('Sri', 39, 70.)
]

We also create a list of tuples for each column, specifying name and data type.

In [3]:
dtypes = [('name', 'U10'), ('age', 'i4'), ('weight', 'f4')]

We pass these to NumPy's array constructor.

In [4]:
people = np.array(data, dtype=dtypes)

This returns a structured array.

In [81]:
people

array([('Alice', 25, 55. ), ('Bob', 32, 60.5), ('Sri', 39, 70. )],
      dtype=[('name', '<U10'), ('age', '<i4'), ('weight', '<f4')])

Data may be accessed using column names.

In [82]:
people['name']

array(['Alice', 'Bob', 'Sri'], dtype='<U10')

We see that its Python type is just `ndarray`.

In [83]:
type(people)

numpy.ndarray

We can also access the data type list as an attribute of the array.

In [84]:
people.dtype

dtype([('name', '<U10'), ('age', '<i4'), ('weight', '<f4')])
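Beyond selecting a whole column by name, a structured array also supports row indexing, boolean masks built from a field, and sorting by a field. The following is a minimal sketch that rebuilds the same `people` array so it runs on its own; the variable names `first`, `over_30`, and `by_weight_desc` are just illustrative.

```python
import numpy as np

# Same illustrative data and field definitions as in the cells above.
data = [('Alice', 25, 55.0), ('Bob', 32, 60.5), ('Sri', 39, 70.0)]
dtypes = [('name', 'U10'), ('age', 'i4'), ('weight', 'f4')]
people = np.array(data, dtype=dtypes)

# Integer indexing returns a single record; fields are still accessed by name.
first = people[0]
first_name = first['name']

# A boolean mask built from one field selects whole records.
over_30 = people[people['age'] > 30]
names_over_30 = over_30['name'].tolist()

# np.sort with order= sorts records by the named field; [::-1] reverses the result.
by_weight_desc = np.sort(people, order='weight')[::-1]

print(first_name)
print(names_over_30)
print(by_weight_desc['name'].tolist())
```

Each field keeps its own dtype throughout, which is what distinguishes this from an `object` array.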

In Pandas, we can convert data back to a NumPy data structure with `df.to_numpy()`.

This is preferable to `df.values` (which is an attribute, not a method).

In [85]:
import pandas as pd

Here we build a Pandas dataframe from the original list of tuples, `data` (not from the structured array `people`).

In [86]:
df = pd.DataFrame(data)

In [87]:
df

|   | 0 | 1 | 2 |
|---|---|---|---|
| 0 | Alice | 25 | 55.0 |
| 1 | Bob | 32 | 60.5 |
| 2 | Sri | 39 | 70.0 |


Then we convert back to a NumPy array ...

In [88]:
npa = df.to_numpy()

In [89]:
npa

array([['Alice', 25, 55.0],
       ['Bob', 32, 60.5],
       ['Sri', 39, 70.0]], dtype=object)

In [90]:
type(npa)

numpy.ndarray

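Interestingly, the dtype is different.

It's now `object`: because the columns have mixed types, each element is stored as a generic Python object rather than as a typed field.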

In [91]:
npa.dtype

dtype('O')
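To see where the `object` dtype comes from, and how it might be avoided, here is a small sketch. It assumes column names (`name`, `age`, `weight`) that the dataframe above does not have, and it uses `DataFrame.to_records()` as one possible way to keep a dtype per column; it is not the only option.

```python
import numpy as np
import pandas as pd

data = [('Alice', 25, 55.0), ('Bob', 32, 60.5), ('Sri', 39, 70.0)]
# Column names are an assumption added for this example.
df = pd.DataFrame(data, columns=['name', 'age', 'weight'])

# Mixed columns have no common NumPy dtype, so to_numpy() falls back to object.
mixed = df.to_numpy()

# Selecting only the numeric columns gives a numeric array
# (the int and float columns are upcast to a common float64).
numeric = df[['age', 'weight']].to_numpy()

# to_records() returns a record array with one field per column.
records = df.to_records(index=False)

print(mixed.dtype)
print(numeric.dtype)
print(records.dtype.names)
```

The record array can be indexed by field name, much like the structured array `people` earlier in these notes.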

# Lutz

Why use classes?

> Because using classes well requires some **up-front planning**, they tend to be of more interest to people who work in **strategic mode** (doing long-term product development) than to people who work in tactical mode (where time is in very short supply).

Introduces principle of **composition**: Use objects as components that are combined to create a solution.

Lutz foregrounds inheritance, but I consider the ideas of encapsulation and composition (above) as primary.
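As a rough illustration of composition, the sketch below builds one object out of another instead of inheriting from it. The `Engine` and `Car` classes are hypothetical names invented for this note, not taken from Lutz.

```python
class Engine:
    """A component object with its own state and behavior."""

    def __init__(self, horsepower):
        self.horsepower = horsepower
        self.running = False

    def start(self):
        self.running = True


class Car:
    """A container object composed of an Engine (has-a, not is-a)."""

    def __init__(self, name, engine):
        self.name = name
        self.engine = engine  # the embedded component

    def drive(self):
        # Delegate part of the work to the component.
        self.engine.start()
        return f"{self.name} is driving with {self.engine.horsepower} hp"


car = Car("Roadster", Engine(horsepower=150))
message = car.drive()
print(message)
```

Here `Car` encapsulates its engine and delegates to it; swapping in a different engine object requires no change to the `Car` class.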

# Class Attributes

https://ontoligent.github.io/DS5100-2024-01-O/notebooks/M07_PythonClasses/M07-04-ClassAttributeWeirdness.html

In [53]:
class Foo(): 
    x = 1
    y = []

In [50]:
foo1 = Foo()

In [51]:
foo1.x = 2

In [52]:
Foo.x, foo1.x

(1, 2)

In [37]:
foo2 = Foo()

In [38]:
Foo.x = 2

In [39]:
Foo.x, foo2.x

(2, 2)

In [None]:
# foo1.y = [10]

In [40]:
foo1.y.append(10)

In [41]:
Foo.y

[10]

In [42]:
foo1.y

[10]

In [45]:
foo2.y

[10]
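The cells above can be summarized in one self-contained sketch: assigning to `foo1.x` rebinds the name on the instance only, while `foo1.y.append(10)` mutates the single list object that the class and all instances share. The `Bar` class is a hypothetical addition showing one common way to give each instance its own list, by creating it in `__init__`.

```python
class Foo:
    x = 1    # class attribute: instances see it until they rebind the name
    y = []   # class attribute: one list object shared by all instances


class Bar:
    def __init__(self):
        self.y = []  # instance attribute: a fresh list per instance


foo1, foo2 = Foo(), Foo()

# Rebinding creates a new attribute on the instance; the class is untouched.
foo1.x = 2
has_own_x = 'x' in vars(foo1)

# Mutating in place changes the one list that everything points to.
foo1.y.append(10)
shared = foo1.y is foo2.y is Foo.y

bar1, bar2 = Bar(), Bar()
bar1.y.append(10)

print(Foo.x, foo1.x, foo2.x)
print(Foo.y, foo2.y, shared)
print(bar1.y, bar2.y)
```

Note that uncommenting `foo1.y = [10]` from the earlier cell would behave like the `x` case: it rebinds `y` on `foo1` alone and leaves `Foo.y` unchanged.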