# PYTHON FOR DATA ANALYSIS

**Patrick Nemeth** 

**March 26, 2024**

# Table of Contents
1. [Introduction](#Introduction)
2. [Python Language Basics](#Python-Language-Basics)
    1. [Language Semantics](#Language-Semantics)
    2. [Indentation, not braces](#Indentation,-not-braces)
    3. [Everything is an object](#Everything-is-an-object)
    4. [Comments](#Comments)
    5. [Function and object method calls](#Function-and-object-method-calls)
    6. [Variables and argument passing](#Variables-and-argument-passing)
    7. [Dynamic references, strong types](#Dynamic-references,-strong-types)
    8. [Attributes and methods](#Attributes-and-methods)
    9. [Duck typing](#Duck-typing)
    10. [Imports](#Imports)
    11. [Binary operators and comparisons](#Binary-operators-and-comparisons)
    12. [Mutable and immutable objects](#Mutable-and-immutable-objects)
    13. [Scalar Types](#Scalar-Types)
        1. [Numeric types](#Numeric-types)
        2. [Strings](#Strings)
        3. [Bytes and Unicode](#Bytes-and-Unicode)
        4. [Booleans](#Booleans)
        5. [Type casting](#Type-casting)
        6. [None](#None)
        7. [Dates and times](#Dates-and-times)
    14. [Control Flow](#Control-Flow)
        1. [if, elif, and else](#if,-elif,-and-else)
        2. [for loops](#for-loops)

## 1. Introduction
Hi, in this series of notebooks I will be working through O'Reilly's "Python for Data Analysis, 3rd Edition" by Wes McKinney. The big idea is to reinforce the subjects that were covered in my data science bootcamp, and perhaps learn something new. Here goes!

#### Tab Completion 

Pressing the `Tab` key searches the "namespace" for any variables (objects, functions, etc.) matching the characters you have typed and shows the results in a drop-down menu.

(A namespace in Python is a mapping from names to objects. Namespaces are typically implemented as Python dictionaries, with the name as the key and the corresponding object as the value. Because each module, function, and class has its own namespace, the same name can be used in different namespaces without conflict.)
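As a rough illustration of the note above, the built-ins `locals()` and `vars()` expose namespaces as dictionaries. The function `show_local` and its local variable `pi` are made up for this sketch:

```python
import math


def show_local():
    pi = "not the number"  # a name bound only in this function's local namespace
    return locals()        # dictionary view of the local namespace


local_names = show_local()
module_names = vars(math)  # the math module's namespace as a dictionary

print(local_names["pi"])               # not the number
print(module_names["pi"] is math.pi)   # True
```

The two `pi` names never collide because each lives in its own namespace.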

In [None]:
an_apple = 27
an_example = 42

# an<press `Tab` key>
# see the drop-down list of variables starting with "an"

For methods and attributes, type a period and then press `Tab`. For example, typing `b.` after defining a list `b` and pressing `Tab` brings up a drop-down list of all the methods and attributes available for lists:

In [None]:
b = [1, 2, 3]

# b.<press `Tab` key>
# see the drop-down list of methods available for lists

The same is true for modules:

In [None]:
import datetime

# datetime.<press `Tab` key>
# see a drop-down list of the attributes and methods available in the datetime module

This also works with file paths and function keyword arguments (more on this later).

#### Introspection

Use a `?` before or after a variable to display some general info about the object:

In [3]:
b = [1, 2, 3]
b?

Type:        list
String form: [1, 2, 3]
Length:      3
Docstring:  
Built-in mutable sequence.

If no argument is given, the constructor creates a new empty list.
The argument must be an iterable if specified.

In [4]:
print?

Signature: print(*args, sep=' ', end='\n', file=None, flush=False)
Docstring:
Prints the values to a stream, or to sys.stdout by default.

sep
  string inserted between values, default a space.
end
  string appended after the last value, default a newline.
file
  a file-like object (stream); defaults to the current sys.stdout.
flush
  whether to forcibly flush the stream.
Type:      builtin_function_or_method

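This is called "object introspection." If the object is a function or method, its docstring (when defined) is shown as well. For example, define the following function: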

In [9]:
def add_numbers(a, b):
    ''' 
    Add two numbers together.

    Returns
    -------
    the_sum : type of arguments
    '''
    return a + b


Then use `?` to display its signature and docstring:

In [10]:
add_numbers?

Signature: add_numbers(a, b)
Docstring:
Add two numbers together.

Returns
-------
the_sum : type of arguments
File:      /var/folders/4j/ygh5mh6j1vn2g2996l16p1d40000gn/T/ipykernel_63057/3723837611.py
Type:      function

Finally, characters combined with the wildcard `*` will show all names matching the wildcard expression:

In [11]:
import numpy as np

np.*load*?

np.__loader__
np.load
np.loadtxt
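The `?` syntax only works in IPython and Jupyter; it is not valid in a plain Python script. A rough standard-library equivalent, sketched here with the `inspect` module and a shortened version of `add_numbers`, might look like this:

```python
import inspect


def add_numbers(a, b):
    """Add two numbers together."""
    return a + b


signature = str(inspect.signature(add_numbers))  # the call signature as text
docstring = inspect.getdoc(add_numbers)          # the cleaned-up docstring
type_name = type(add_numbers).__name__           # the object's type

print(signature)  # (a, b)
print(docstring)  # Add two numbers together.
print(type_name)  # function
```

The built-in `help(add_numbers)` prints similar information in one call.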

## 2. Python Language Basics
An overview of essential Python programming concepts and language mechanics.

### 2.1 Language Semantics


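#### Indentation

Python uses whitespace (tabs or spaces) to structure code instead of the braces used in many other languages. For example, consider this `for` loop:
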
for x in array:
    if x < pivot:
        less.append(x)
    else:
        greater.append(x)

A colon denotes the start of an indented code block, after which all of the code must be indented by the same amount until the end of the block. (The author recommends using four spaces for indentation, which Jupyter notebooks apply automatically.)
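The loop above is only a fragment, since `array`, `pivot`, `less`, and `greater` are not defined. A runnable version, with sample values chosen purely for illustration, could be:

```python
array = [7, 2, 9, 4, 5]  # sample data (an assumption for this sketch)
pivot = 5
less = []
greater = []

for x in array:
    if x < pivot:          # this block is indented one level under "for"
        less.append(x)
    else:                  # "else" lines up with its matching "if"
        greater.append(x)

print(less)     # [2, 4]
print(greater)  # [7, 9, 5]
```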

#### Everything is an object

Every number, string, data structure, function, class, module, and so on exists in the Python interpreter in its own "box" which is referred to as a Python object. Each object has an associated type (e.g., string or function) and internal data. In practice, objects can have attributes and methods.
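A quick way to see this is to ask several very different things for their type. In this sketch the function `greet` is a made-up example:

```python
import math


def greet():
    return "hello"


# A number, a string, a list, a function, and a module are all objects.
objects = [42, "text", [1, 2], greet, math]
type_names = [type(obj).__name__ for obj in objects]
print(type_names)  # ['int', 'str', 'list', 'function', 'module']

# Because a function is an object, it can be bound to another name and called.
say = greet
print(say())  # hello
```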

#### Comments

Text preceded by a hash mark (#) gets ignored by the Python interpreter. Use this to add comments to your code.

```python
# file_handle is assumed to be an already-open text file
result = []
for line in file_handle:
    # keep the empty lines for now
    # if len(line) == 0:
    #    continue
    result.append(line.replace('foo', 'bar'))
```


#### Function and object method calls

A **function** in Python is a reusable block of code that performs a specific task. You call a function using parentheses and passing zero or more arguments, optionally assigning the returned value to a variable:

```
result = f(x, y, z)
g()
```

Nearly every object in Python has attached functions, known as methods, that have access to the object's internal data. You can call them using the following syntax:

```
obj.some_method(x, y, z)
```

Functions can take both positional and keyword arguments:

```
result = f(a, b, c, d=5, e='foo')
```
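The snippets above use placeholder names such as `f` and `obj`. A concrete sketch, using a hypothetical function `describe`, shows positional arguments, keyword arguments, and a method call together:

```python
def describe(name, count, sep=": ", suffix=""):
    """Build a short label from two positional and two keyword arguments."""
    return f"{name}{sep}{count}{suffix}"


plain = describe("apples", 3)                          # defaults used for sep and suffix
custom = describe("apples", 3, suffix="!", sep=" = ")  # keywords may come in any order
words = "a,b,c".split(",")                             # a method called on a string object

print(plain)   # apples: 3
print(custom)  # apples = 3!
print(words)   # ['a', 'b', 'c']
```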

#### Variables and argument passing

When assigning a variable (a name) in Python, you are creating a named reference to the object on the right-hand side of the equals sign.



In [3]:
a = [1, 2, 3]

# bind a second name, "b", to the same list

b = a
b

# b refers to the same object as a, so it displays [1, 2, 3]

[1, 2, 3]

"a" and "b" now refer to the same object, [1, 2, 3]. 

Assignment is also referred to as "binding", as we are binding a name to an object.  Variable names may be referred to as "bound variables". 

Understanding the semantics of references in Python, and when, how, and why data is copied is especially important when working with larger datasets.
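The difference between a shared reference and a copy can be checked directly. In this small sketch, `c` is an extra name introduced to hold a copy:

```python
a = [1, 2, 3]
b = a        # b is bound to the same list object as a
c = list(a)  # c is bound to a new list with the same elements (a shallow copy)

a.append(4)  # modify the list through the name "a"

print(b)       # [1, 2, 3, 4]  (the change is visible through b)
print(c)       # [1, 2, 3]     (the copy is unaffected)
print(a is b)  # True
print(a is c)  # False
```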

When you pass an object (like a list, dictionary, or any custom object) as an argument to a function, Python does not create a copy of the object. Instead, it creates a new local variable inside the function, which references the same object in memory as the original. This means changes made to the object inside the function affect the object outside the function, because both the local variable inside the function and the variable used to call the function point to the same object in memory.

Let's define a function that modifies the list passed to it, and then call it:

In [1]:
def append_element(some_list, element): # define a function that appends an element to a list
    some_list.append(element) # append the element to the list

In [2]:
data = [1, 2, 3] # create a list

append_element(data, 4) # append 4 to the list

data # see the list

[1, 2, 3, 4]
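Only changes made to the object itself are visible to the caller. Rebinding the local name to a new object leaves the original untouched. The function `rebind_list` below is a hypothetical counterpart to `append_element`, added to show the contrast:

```python
def append_element(some_list, element):
    some_list.append(element)  # mutates the caller's list in place


def rebind_list(some_list, element):
    some_list = some_list + [element]  # builds a new list and rebinds the local name only
    return some_list


data = [1, 2, 3]
append_element(data, 4)
new_list = rebind_list(data, 5)

print(data)      # [1, 2, 3, 4]
print(new_list)  # [1, 2, 3, 4, 5]
```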

#### Dynamic references, strong types

Variables have no inherent type associated with them; a variable can reference any type of object just by assigning it a new value. Variables are simply names for objects within a particular namespace. The type information is stored in the object itself.

Python is what is called a "strongly typed" language, which means that every object has a specific type (or class), and implicit conversions will occur only in certain obvious circumstances.
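Both ideas can be seen in a few lines. This is a minimal sketch: the same name is rebound to objects of different types, an integer is implicitly promoted to a float for division, and adding a string to an integer is rejected:

```python
a = 5
first_type = type(a).__name__   # 'int'
a = "foo"                       # the same name now refers to a string
second_type = type(a).__name__  # 'str'

# An "obvious" implicit conversion: the int 2 is treated as a float here.
ratio = 4.5 / 2

# No implicit conversion between str and int, so this raises TypeError.
try:
    "5" + 5
except TypeError as exc:
    error_name = type(exc).__name__

print(first_type, second_type, ratio, error_name)  # int str 2.25 TypeError
```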


Knowing the type of an object is important, and it's useful to be able to write functions that can handle many different kinds of input. You can check the type of an object using the `type` function. The built-in `isinstance()` function checks whether an object is an instance of a particular class (or of any class in a tuple of classes) and returns a Boolean value.


In [3]:
a = 5 # assign 5 to variable "a"

isinstance(a, int) # use "isinstance" to check if "a" is an integer

True
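The `type` function mentioned above returns the exact class of an object, whereas `isinstance()` also accepts subclasses. A short sketch of the difference, using the fact that `bool` is a subclass of `int`:

```python
a = 5
exact_match = type(a) is int            # exact type comparison

flag = True
flag_exact = type(flag) is int          # bool is not exactly int
flag_instance = isinstance(flag, int)   # but bool is a subclass of int

print(type(a))        # <class 'int'>
print(exact_match)    # True
print(flag_exact)     # False
print(flag_instance)  # True
```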

**(TUPLES)** A tuple in Python is a built-in data structure that represents an ordered collection of elements. Tuples are similar to lists, but they are immutable, meaning that once a tuple is created, its elements cannot be modified, added, or removed. This immutability makes tuples a suitable choice for storing a collection of elements that should not change throughout the execution of a program.

The term "tuple" comes from mathematics, where it denotes a finite ordered list (sequence) of elements. In Python, a tuple is a data structure that serves a similar purpose.


**Characteristics of Tuples**

**Ordered**: Tuples maintain the order of the elements inside them. The first element you add is the first element in the tuple, and so on.

**Immutable**: Once a tuple is created, you cannot add, remove, or replace its elements. (A mutable object stored inside a tuple, such as a list, can still be modified in place.)

**Indexed**: Elements in a tuple can be accessed using their index, with the first index being 0, similar to lists.

**Heterogeneous**: Tuples can contain elements of different data types, including integer, float, string, and even other compound objects like lists, dictionaries, or other tuples.

**Iterable**: Tuples are iterable, meaning you can loop over them with a for loop.
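The characteristics listed above can be checked with a small example. The tuple `record` and its contents are invented for this sketch:

```python
record = (3, 4.5, "label", [1, 2])  # heterogeneous: int, float, str, list

first = record[0]  # indexed, starting at 0

# Immutable: replacing an element raises TypeError.
try:
    record[0] = 10
except TypeError:
    replaced = False
else:
    replaced = True

# Iterable: loop over the elements in order.
kinds = [type(item).__name__ for item in record]

# A mutable object stored inside the tuple can still be changed in place.
record[3].append(3)

print(first)     # 3
print(replaced)  # False
print(kinds)     # ['int', 'float', 'str', 'list']
print(record)    # (3, 4.5, 'label', [1, 2, 3])
```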

If you want to check whether an object belongs to one of several types, you can pass a tuple of types as the second argument to `isinstance()`. 

In [5]:
a = 5; b = 4.5 # assign 5 to "a" and 4.5 to "b"

isinstance(a, (int, float)) # check if "a" is an integer or a float


True

In [6]:
isinstance(b, (int, float)) # check if "b" is an integer or a float


True

#### Attributes and methods

Objects in Python typically have both attributes (other Python objects stored "inside" the object) and methods (functions associated with an object that can have access to the object's internal data). Both of these are accessed via the syntax `obj.attribute_name`.
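As a minimal sketch of the dot syntax, the example below reads an attribute of a complex number and calls a method of a string. The built-in `getattr` function looks up an attribute or method by name, which can be handy when the name is only known at run time:

```python
text = "foo bar"
z = 3 + 4j

real_part = z.real      # attribute access: the real component of a complex number
upper = text.upper()    # method call

split_method = getattr(text, "split")  # same as text.split, looked up by name
parts = split_method()                 # calling the bound method

print(real_part)  # 3.0
print(upper)      # FOO BAR
print(parts)      # ['foo', 'bar']
```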


# GPT KLUDGE ZONE 

### 2.2 Indentation, not braces
Discuss Python's use of whitespace for code structuring instead of braces.

### 2.3 Everything is an object
Explain Python's object model and how everything in Python is treated as an object.

### 2.4 Comments
Discuss comments in Python and their usage.

### 2.5 Function and object method calls
Explain how functions and methods are called in Python.

### 2.6 Variables and argument passing
Discuss variable assignment and argument passing in Python.

### 2.7 Dynamic references, strong types
Explain Python's dynamic typing and strong typing.

### 2.8 Attributes and methods
Discuss accessing attributes and methods of objects in Python.

### 2.9 Duck typing
Explain the concept of duck typing in Python.

### 2.10 Imports
Discuss importing modules in Python.

### 2.11 Binary operators and comparisons
Explain common binary operators and comparisons in Python.

### 2.12 Mutable and immutable objects
Discuss mutable and immutable objects in Python.

### 2.13 Scalar Types
Discuss Python's scalar types, including numeric types, strings, bytes, and booleans.

### 2.14 Control Flow
Discuss control flow statements in Python, including if, elif, else, and for loops.

