# **Python Data Structures and NumPy**

This notebook explores fundamental Python data structures (dictionaries and sets) and demonstrates the use of NumPy for array manipulation.

# **Dictionaries**

Dictionaries are versatile for storing key-value pairs.  The code demonstrates:

*   Creating and printing a dictionary.
*   Adding, modifying, and deleting key-value pairs.
*   Checking key existence.
*   Accessing values, keys, and items.
*   Using dictionaries to represent tabular data.
*   Working with nested dictionaries.


# **Sets**

Sets are unordered collections of unique elements.  The code shows:

*   Creating a set.
*   Adding and removing elements.
*   Set operations (union, intersection, difference).
*   Removing duplicates from a list using sets.

# **NumPy Arrays**

NumPy provides efficient array operations. The notebook covers:

*   Creating arrays from lists.
*   Multi-dimensional arrays.
*   Array attributes (size, shape, data type).
*   Array slicing and indexing.
*   Array manipulation (reshaping, transposing, flattening).
*   Statistical functions (sum, mean, standard deviation).
*   Array arithmetic and matrix operations.
*   Array concatenation.


# **Code Examples**

The cells below work through each topic listed above, showing how dictionaries, sets, and NumPy arrays are used for common data manipulation tasks.

# **Dictionaries**

In [2]:
my_dict = {'name': 'John Wick',
           'age': 13,
           'employer': 'the High Table'}

print(my_dict)

{'name': 'John Wick', 'age': 13, 'employer': 'the High Table'}


In [3]:
my_dict['city'] = 'NYC'
my_dict

{'name': 'John Wick', 'age': 13, 'employer': 'the High Table', 'city': 'NYC'}

In [4]:
my_dict['age']

13

In [5]:
my_dict['new_key'] = 'New Value'
my_dict

{'name': 'John Wick',
 'age': 13,
 'employer': 'the High Table',
 'city': 'NYC',
 'new_key': 'New Value'}

In [6]:
del my_dict['new_key']

In [7]:
my_dict

{'name': 'John Wick', 'age': 13, 'employer': 'the High Table', 'city': 'NYC'}

In [8]:
'name' in my_dict

True

In [9]:
13 in my_dict  # `in` checks the keys, not the values

False

In [11]:
my_dict.values()

dict_values(['John Wick', 13, 'the High Table', 'NYC'])

In [12]:
13 in my_dict.values()

True
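The two membership checks above behave differently because `in` on a dictionary looks only at keys. A minimal sketch of that distinction follows; the name `person` is a stand-in chosen for illustration, and `.get()` is not used elsewhere in this notebook but is a standard dictionary method for reading a key that may be absent.

```python
person = {'name': 'John Wick', 'age': 13}

# `in` on a dict looks at keys only.
age_is_key = 'age' in person
thirteen_is_key = 13 in person          # 13 is a value, not a key

# To search the values, go through .values().
thirteen_is_value = 13 in person.values()

# .get() returns a default instead of raising KeyError for a missing key.
city = person.get('city', 'unknown')

print(age_is_key, thirteen_is_key, thirteen_is_value, city)
```

Searching `.values()` scans every entry, whereas a key lookup is a hash lookup, so key checks are the cheaper of the two on large dictionaries.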

In [13]:
my_dict.keys()

dict_keys(['name', 'age', 'employer', 'city'])

In [14]:
len(my_dict)

4

To get every key-value pair of the dictionary as a tuple, we can use `.items()`.

In [15]:
my_dict.items()

dict_items([('name', 'John Wick'), ('age', 13), ('employer', 'the High Table'), ('city', 'NYC')])
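The pairs returned by `.items()` are most often consumed in a loop. The sketch below rebuilds the same dictionary so that it runs on its own and unpacks each tuple into a key and a value; the variable name `pairs` is introduced here only for illustration.

```python
my_dict = {'name': 'John Wick',
           'age': 13,
           'employer': 'the High Table',
           'city': 'NYC'}

# Each item is a (key, value) tuple, so it can be unpacked directly.
for key, value in my_dict.items():
    print(f"{key}: {value}")

# The view can be turned into a plain list when indexing is needed.
pairs = list(my_dict.items())
print(pairs[0])
```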

Let's store tabular data in a dictionary

In [17]:
data = {'name': ['John Wick', 'John Grisham', 'John Travolta'],
        'age': [13, 12, 14],
        'city': ['NYC', 'LA', 'SF']}
data

{'name': ['John Wick', 'John Grisham', 'John Travolta'],
 'age': [13, 12, 14],
 'city': ['NYC', 'LA', 'SF']}

In the dictionary above, the keys act as column names and each value is the list of entries in that column.
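A column-oriented dictionary like this can be turned into one record per row when that is more convenient. The following is one possible sketch using `zip`; the name `rows` is hypothetical, and the approach assumes every column list has the same length.

```python
data = {'name': ['John Wick', 'John Grisham', 'John Travolta'],
        'age': [13, 12, 14],
        'city': ['NYC', 'LA', 'SF']}

# zip(*data.values()) walks the columns in parallel, yielding one tuple per row.
# Pairing each tuple with the column names gives one dict per record.
rows = [dict(zip(data.keys(), values)) for values in zip(*data.values())]

for row in rows:
    print(row)
```

Note that `zip` stops at the shortest column, so unequal column lengths would silently drop entries.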

# **Nested Dictionaries**

In [18]:
nested_dict = {'dictA':{'keyA1': 'valueA1',
                        'keyA2': 'valueA2'},
               'dictB': {'keyB1': 'valueB1',
                        'keyB2': 'valueB2'}}

print(nested_dict)

{'dictA': {'keyA1': 'valueA1', 'keyA2': 'valueA2'}, 'dictB': {'keyB1': 'valueB1', 'keyB2': 'valueB2'}}


In [19]:
nested_dict['dictB']

{'keyB1': 'valueB1', 'keyB2': 'valueB2'}

In [20]:
del nested_dict['dictB']
nested_dict

{'dictA': {'keyA1': 'valueA1', 'keyA2': 'valueA2'}}
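Values inside a nested dictionary are reached by chaining keys. The sketch below shows reading, adding, and safely probing an inner entry; `keyA3` and `dictC` are made-up names used only to illustrate the pattern.

```python
nested_dict = {'dictA': {'keyA1': 'valueA1',
                         'keyA2': 'valueA2'}}

# Chain the outer key and the inner key to read a nested value.
inner_value = nested_dict['dictA']['keyA2']

# The same chaining works for assignment.
nested_dict['dictA']['keyA3'] = 'valueA3'

# Chained .get() calls avoid a KeyError when the outer key may be missing.
missing = nested_dict.get('dictC', {}).get('keyC1')

print(inner_value, nested_dict['dictA'], missing)
```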

# **Sets**

Sets are mutable, so we can add and remove values, but they cannot contain duplicates and every element must itself be hashable (for example a number, string, or tuple). Sets are unordered and do not support indexing.

In [21]:
my_set = {1,2,3,4,5,6}
my_set

{1, 2, 3, 4, 5, 6}

In [22]:
my_set.add(10)
my_set

{1, 2, 3, 4, 5, 6, 10}

In [23]:
my_set.remove(10)
my_set

{1, 2, 3, 4, 5, 6}

In [24]:
my_set.pop()  # removes and returns an arbitrary element

1

In [25]:
my_set

{2, 3, 4, 5, 6}

In [26]:
del my_set(4)  # invalid syntax: del cannot target a call; use my_set.remove(4) or my_set.discard(4) instead
my_set

SyntaxError: cannot delete function call (<ipython-input-26-126bfbc3bf4e>, line 1)

In [27]:
4 in my_set

True
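Because the `del` statement above fails, 4 is still in the set. Elements are removed with set methods instead. The sketch below contrasts `remove()` and `discard()` on a separate set named `numbers` (a name chosen here so the notebook's `my_set` is left untouched).

```python
numbers = {2, 3, 4, 5, 6}

# discard() removes the element if present and does nothing otherwise.
numbers.discard(4)
numbers.discard(99)          # 99 is absent, no error

# remove() raises KeyError when the element is absent.
try:
    numbers.remove(99)
    remove_raised = False
except KeyError:
    remove_raised = True

print(numbers, remove_raised)
```

`discard()` is the safer choice when it is not known whether the element exists; `remove()` is useful when a missing element should be treated as a bug.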

In [28]:
set1 = {1,2,3,4}
set2 = {4,5,6,7}
set1.union(set2) # gives the union of both sets and removes duplicates

{1, 2, 3, 4, 5, 6, 7}

In [29]:
set1.intersection(set2)  # gives the intersection of both sets.

{4}

In [31]:
set1.difference(set2) # gives the items of set1 that are not in set2

{1, 2, 3}
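The same set operations are also available as operators, which some readers find more compact. A short sketch, reusing the two sets from above; the symmetric difference (`^`) is an extra operation not covered in the cells above.

```python
set1 = {1, 2, 3, 4}
set2 = {4, 5, 6, 7}

union = set1 | set2                 # same as set1.union(set2)
intersection = set1 & set2          # same as set1.intersection(set2)
difference = set1 - set2            # same as set1.difference(set2)
symmetric = set1 ^ set2             # elements in exactly one of the two sets

print(union, intersection, difference, symmetric)
```

One practical difference: the methods accept any iterable (for example a list), while the operators require both operands to be sets.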

In [33]:
List = [1,1,2,2,3,3,4,5,6,6,7]

# a list can have duplicates so we can use sets to take care of the duplicate values.
set(List)

{1, 2, 3, 4, 5, 6, 7}
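Converting to a set removes duplicates, but a set does not promise to keep the original order of the list. When order matters, one common alternative is `dict.fromkeys`, sketched below; the list `values` is made up for illustration.

```python
values = [3, 1, 3, 2, 1]

# A set drops duplicates; the result has no guaranteed order.
unique_unordered = set(values)

# dict keys are unique and keep insertion order (Python 3.7+),
# so this keeps the first occurrence of each item in order.
unique_ordered = list(dict.fromkeys(values))

print(unique_unordered, unique_ordered)
```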

# **NumPy Arrays**

A NumPy array is similar to a list in that it stores a sequence of values, but there are some differences. A list is heterogeneous, meaning it can hold items of different data types, whereas a NumPy array is homogeneous: every element shares one data type. NumPy arrays can be 1-dimensional like a list, 2-dimensional like a table, or have more dimensions.
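To make the homogeneity point concrete, the sketch below passes mixed inputs to `np.array` and inspects the resulting `dtype`. The variable names are illustrative only. NumPy picks a single type that can represent all of the inputs.

```python
import numpy as np

# A list keeps each item's own type.
python_list = [1, 2.5, 'a']
item_types = [type(item).__name__ for item in python_list]

# Mixing ints and floats gives an all-float array.
mixed_numbers = np.array([1, 2.5, 3])

# Mixing a number and a string gives an array of strings.
mixed_text = np.array([1, 'a'])

print(item_types)
print(mixed_numbers.dtype)
print(mixed_text.dtype.kind)   # 'U' stands for a Unicode string dtype
```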

In [3]:
import numpy as np

my_list = [1,2,3,4,5]
my_array = np.array(my_list)
my_array

array([1, 2, 3, 4, 5])

In [4]:
another_list = [11,12,13,14,15]
another_array = np.array([my_list, another_list])   # pass a list of lists
another_array

array([[ 1,  2,  3,  4,  5],
       [11, 12, 13, 14, 15]])

In [5]:
type(another_array)

numpy.ndarray

In [6]:
another_array.size  # gives us the size of the array. This has 10 items

10

In [7]:
another_array.shape  # gives us that the array has 2 rows and 5 columns

(2, 5)

In [8]:
another_array.dtype  # gives the data type of the elements, here int64 (the default integer type can differ by platform)

dtype('int64')

In [9]:
another_array

array([[ 1,  2,  3,  4,  5],
       [11, 12, 13, 14, 15]])

Indexing and slicing NumPy arrays

In [10]:
another_array[1]

array([11, 12, 13, 14, 15])

In [13]:
another_array[1, 3]   # row index 1 (second row), column index 3 (fourth element); indexing starts at 0

np.int64(14)

In [12]:
another_array[1, 2:]

array([13, 14, 15])

In [15]:
another_array[::-1]   # reverses the order of the rows, just like reversing a list

array([[11, 12, 13, 14, 15],
       [ 1,  2,  3,  4,  5]])

In [17]:
another_array[::-1, ::-1]  # this reverses the array and also reverses the elements inside each row of the array. This is called 180 degree rotation of the 2D array

array([[15, 14, 13, 12, 11],
       [ 5,  4,  3,  2,  1]])

In [27]:
np.identity(5)  # this will give a square identity matrix of size 5 which means 5 rows and 5 columns. Identity matrix is always a square matrix

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

In [28]:
np.identity(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [29]:
np.ones([2,3])

array([[1., 1., 1.],
       [1., 1., 1.]])

In [30]:
another_array

array([[ 1,  2,  3,  4,  5],
       [11, 12, 13, 14, 15]])

In [31]:
# let's reshape this 2 by 5 array into 5 by 2

another_array.reshape(5,2)

array([[ 1,  2],
       [ 3,  4],
       [ 5, 11],
       [12, 13],
       [14, 15]])
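The reshaped output above fills the new rows in reading order (1, 2, then 3, 4, and so on), which is why it differs from a transpose. The sketch below checks that and shows two related behaviors of `reshape`; the names `grid` and `tall` are chosen here for illustration.

```python
import numpy as np

grid = np.array([[1, 2, 3, 4, 5],
                 [11, 12, 13, 14, 15]])

# -1 asks NumPy to infer that dimension from the total size (10 / 5 = 2).
tall = grid.reshape(5, -1)
print(tall.shape)

# reshape keeps the row-major element order, so it is not a transpose.
same_as_transpose = np.array_equal(tall, grid.T)
print(same_as_transpose)

# A shape whose product differs from the element count is rejected.
try:
    grid.reshape(3, 4)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)
```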

In [32]:
# let's look at some statistical functions on arrays
np.sum(another_array)

np.int64(80)

In [33]:
np.sum(another_array, axis = 1) # this will give the sum of each of the 2 rows

array([15, 65])

In [35]:
np.sum(another_array, axis = 0) # this will give the sum of each of the 5 columns

array([12, 14, 16, 18, 20])

In [38]:
np.mean(another_array, axis = 1, dtype = np.int64)  # mean of each row, computed and returned as int64

array([ 3, 13])
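In the notebook's array the row means happen to be whole numbers, so forcing an integer `dtype` loses nothing. With other data the fractional part is dropped. The sketch below uses a small made-up array, `small`, to show the difference; treat it as an illustration rather than a recommendation to use integer means.

```python
import numpy as np

small = np.array([[1, 2],
                  [3, 4]])

# Default behavior: integer input produces float64 means.
float_means = np.mean(small, axis=1)

# With an integer dtype the result is stored as integers,
# so 1.5 and 3.5 lose their fractional part.
int_means = np.mean(small, axis=1, dtype=np.int64)

print(float_means, int_means)
```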

In [44]:
np.std(another_array)

np.float64(5.196152422706632)
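By default `np.std` computes the population standard deviation, dividing by the number of elements. The `ddof` parameter switches to the sample version, which divides by N minus `ddof`. A brief sketch on the same array follows; the variable names are illustrative.

```python
import numpy as np

another_array = np.array([[1, 2, 3, 4, 5],
                          [11, 12, 13, 14, 15]])

# Population standard deviation: divide by N (here 10).
population_std = np.std(another_array)

# Sample standard deviation: divide by N - 1 (here 9).
sample_std = np.std(another_array, ddof=1)

print(population_std, sample_std)
```

For this array the squared deviations from the mean of 8 sum to 270, so the two results are the square roots of 27 and 30 respectively.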

In [45]:
another_array -100

array([[-99, -98, -97, -96, -95],
       [-89, -88, -87, -86, -85]])

In [46]:
another_array / 2

array([[0.5, 1. , 1.5, 2. , 2.5],
       [5.5, 6. , 6.5, 7. , 7.5]])

In [48]:
another_array.T  # transpose the matrix

array([[ 1, 11],
       [ 2, 12],
       [ 3, 13],
       [ 4, 14],
       [ 5, 15]])

In [52]:
# let's change the matrix into a one dimensional array. Let's make it flat

another_array.flatten()

array([ 1,  2,  3,  4,  5, 11, 12, 13, 14, 15])

In [54]:
column_array = another_array.flatten().T  # flatten() gives a 1D array; .T has no effect on a 1D array, so this is still shape (10,), not a column

In [55]:
column_array.shape

(10,)
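The shape `(10,)` shows that the result is still one-dimensional: transposing a 1D array leaves it unchanged. If an actual column with shape `(10, 1)` is wanted, the array needs a second axis. Two common ways are sketched below; the names `flat`, `column`, and `column_alt` are chosen here for illustration.

```python
import numpy as np

flat = np.array([1, 2, 3, 4, 5, 11, 12, 13, 14, 15])

# Transposing a 1D array does not change its shape.
print(flat.T.shape)

# reshape(-1, 1) makes one column; -1 lets NumPy infer the row count.
column = flat.reshape(-1, 1)
print(column.shape)

# Indexing with np.newaxis adds the second axis in the same way.
column_alt = flat[:, np.newaxis]
print(np.array_equal(column, column_alt))
```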

In [58]:
# let's do some concatenation of 3 arrays
array1 = np.array([[1,2,3], [3,4,5], [5,6,7]])
array2 = np.array([[11,12,13], [13,14,15], [15,16,17]])
array3 = np.array([[21,22,23], [23,24,25], [25,26,27]])

In [61]:
np.concatenate([array1, array2, array3], axis = 1) # joins the arrays side by side: the row count stays 3 and the columns grow to 9

array([[ 1,  2,  3, 11, 12, 13, 21, 22, 23],
       [ 3,  4,  5, 13, 14, 15, 23, 24, 25],
       [ 5,  6,  7, 15, 16, 17, 25, 26, 27]])

In [62]:
np.concatenate([array1, array2, array3], axis = 0) # stacks the arrays on top of each other: the rows grow to 9 and the column count stays 3

array([[ 1,  2,  3],
       [ 3,  4,  5],
       [ 5,  6,  7],
       [11, 12, 13],
       [13, 14, 15],
       [15, 16, 17],
       [21, 22, 23],
       [23, 24, 25],
       [25, 26, 27]])
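A quick way to keep the two axes straight is to look at the resulting shapes: the axis passed to `np.concatenate` is the one that grows. The sketch below uses two small made-up arrays, `top` and `bottom`, and also shows `np.vstack` and `np.hstack`, which are not used in this notebook but give the same results for 2D inputs.

```python
import numpy as np

top = np.array([[1, 2, 3],
                [4, 5, 6]])
bottom = np.array([[7, 8, 9],
                   [10, 11, 12]])

# axis=0 grows the first dimension (rows): (2, 3) + (2, 3) -> (4, 3)
stacked_rows = np.concatenate([top, bottom], axis=0)

# axis=1 grows the second dimension (columns): (2, 3) + (2, 3) -> (2, 6)
stacked_cols = np.concatenate([top, bottom], axis=1)

print(stacked_rows.shape, stacked_cols.shape)

# For 2D arrays, vstack and hstack are shorthand for the two calls above.
print(np.array_equal(stacked_rows, np.vstack([top, bottom])))
print(np.array_equal(stacked_cols, np.hstack([top, bottom])))
```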

In [65]:
another_array % 2 # remainder after dividing by 2: 1 marks an odd number, 0 marks an even number

array([[1, 0, 1, 0, 1],
       [1, 0, 1, 0, 1]])
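The remainder array can be turned into a boolean mask and used to pull out the matching elements. This boolean indexing step is not part of the original cells; the sketch below is one way to do it, with `is_even` and `evens` as illustrative names.

```python
import numpy as np

another_array = np.array([[1, 2, 3, 4, 5],
                          [11, 12, 13, 14, 15]])

# Comparing the remainder with 0 gives a boolean array of the same shape.
is_even = another_array % 2 == 0

# Indexing with the mask returns the selected elements as a 1D array.
evens = another_array[is_even]

# True counts as 1, so summing the mask counts the even numbers.
even_count = int(is_even.sum())

print(evens, even_count)
```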

In [66]:
array1 + array2 + array3

array([[33, 36, 39],
       [39, 42, 45],
       [45, 48, 51]])

In [67]:
array1 - array2 - array3

array([[-31, -32, -33],
       [-33, -34, -35],
       [-35, -36, -37]])

In [68]:
array1 * array2

array([[ 11,  24,  39],
       [ 39,  56,  75],
       [ 75,  96, 119]])

In [69]:
np.dot(array1, array2)

array([[ 82,  88,  94],
       [160, 172, 184],
       [238, 256, 274]])
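The last two cells give different results because `*` multiplies matching elements, while `np.dot` performs matrix multiplication (rows of the first array times columns of the second). The sketch below repeats both on the same arrays and also uses the `@` operator, which is not used in the cells above but is equivalent to `np.dot` for 2D arrays.

```python
import numpy as np

array1 = np.array([[1, 2, 3], [3, 4, 5], [5, 6, 7]])
array2 = np.array([[11, 12, 13], [13, 14, 15], [15, 16, 17]])

# Element-wise product: each entry is array1[i, j] * array2[i, j].
elementwise = array1 * array2

# Matrix product: entry [i, j] is the dot product of row i and column j.
matrix_product = array1 @ array2

# Top-left entry worked by hand: 1*11 + 2*13 + 3*15 = 82
top_left = 1 * 11 + 2 * 13 + 3 * 15

print(elementwise[0, 0], matrix_product[0, 0], top_left)
print(np.array_equal(matrix_product, np.dot(array1, array2)))
```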