# Python fundamentals

## Setting up the environment in Ubuntu

### VSCode, Jupyter, and conda

- We'll use the Python extension in VSCode, which includes Jupyter notebooks

- Jupyter interprets notebook files with the IPython notebook extension ".ipynb", which are built up from markdown and code cells

- For Python packages themselves, installing Anaconda is an option, but my preference is to use Miniconda and then build Conda environments for specific tasks. This keeps user control over the packages and versions, limits space usage, minimises software conflicts, and enables portability. It also teaches me more about what's going on (by solving issues!)

- We'll set up a general "Data science with Python" conda environment, and add to it over time (and/or generate additional specialist environments)

```
conda env create -f python-data-science.yml
conda activate python-data-science
```

- I initially added only python3, numpy, and pandas to the yml

- With python3 installed, I then set the Jupyter kernel to point at python inside the conda env (using the VSCode interface)

- Ran a hello world test (which failed), and as a result added the ipykernel package so that Jupyter can execute code: `conda install -c conda-forge ipykernel`

In [24]:
# Hello world test

print ("hello world, I was in double quotes")

# Note that quotation marks around strings can be single or double, no difference
print ('hello world, I was in single quotes')

hello world, I was in double quotes
hello world, I was in single quotes


In [25]:
# Sanity check on which python version is being used by the interpreter

from platform import python_version
python_version()


'3.11.4'
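
As a further check that the kernel really is the conda environment's interpreter (and not the system Python), the interpreter path can be printed. This is a minimal sketch; the exact paths depend on where Miniconda was installed, so no expected output is shown.

```python
import sys

# Path of the Python executable that the kernel is running
interpreter_path = sys.executable
# Root directory of the active environment
environment_root = sys.prefix

print("Interpreter:", interpreter_path)
print("Environment:", environment_root)
```

If the kernel is set correctly, both paths would normally contain the environment name (here `python-data-science`), assuming the environment lives in conda's default `envs` directory.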

### Git repository set up

```
git init -b main
git add README.md python-fundamentals.ipynb
git commit -m "Initial commit"
gh repo create
```

### Add Jupyter HTML export functionality (for PDF, I'll use HTML print to PDF)

- Attempted an HTML export to find out which packages were missing: nbconvert was not installed
```
conda install -c anaconda nbconvert
```

## Fundamentals of python 1. Background

- Python is a high-level language (highly abstracted from the machine and from machine-specific instructions)
- Hence it's quicker to pick up: its syntax reads close to natural language and focuses on readability
- "Batteries included" language, i.e., it has a large standard library, or stdlib (pre-written modules, functions, classes, methods, and tools that come with the language). E.g., the module "platform" that we used above is from the stdlib
- Follows strict syntax rules, which affect how the code runs, or whether it runs at all. E.g., the following code works:

In [64]:
# Make a string
my_string = "The gardens are watered!"

# Check the length:
length = len(my_string)
print ('Total length:', length)

# Now make a basic for loop, start at index 0 to the end of the string (length). Move in steps of 4 and print i.
# Out of interest, to actually look at the underlying letters at each i iteration, we can use slice notation.
# [i:i+4] (non-inclusive of the last)

for i in range(0,length,4):
    print ('Current index value of i:', i, '1-based position:', i+1)
    substring = my_string [i:i+4]
    print ("Substring:", substring)

# The last index printed is 20, which refers to position 21. 
# The next step of 4 would go to position 25, which does not exist.


Total length: 24
Current index value of i: 0 1-based position: 1
Substring: The 
Current index value of i: 4 1-based position: 5
Substring: gard
Current index value of i: 8 1-based position: 9
Substring: ens 
Current index value of i: 12 1-based position: 13
Substring: are 
Current index value of i: 16 1-based position: 17
Substring: wate
Current index value of i: 20 1-based position: 21
Substring: red!


In [65]:
# Whereas here it will not run - only due to lack of indentation:

my_string = "The gardens are watered!"
length = len(my_string)
print ('Total length:', length)
for i in range(0,length,4):
print ('Current index value of i:', i, '1-based position:', i+1)

IndentationError: expected an indented block after 'for' statement on line 7 (793192203.py, line 8)
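
The error above is raised when the code is parsed, before any line runs. A small sketch of this, using the built-in `compile()` on a hypothetical two-line source string so that the error can be caught rather than stopping the cell:

```python
# A loop whose body is left unindented (hypothetical example source)
source = "for i in range(3):\nprint(i)\n"

error_name = None
try:
    # compile() parses the source without running it
    compile(source, "<example>", "exec")
except IndentationError as err:
    error_name = type(err).__name__
    print(error_name, "raised on line", err.lineno)

# IndentationError is a subclass of SyntaxError
is_syntax_error = issubclass(IndentationError, SyntaxError)
print(is_syntax_error)
```

Because parsing fails, nothing in the cell executes, which is why even the earlier `print` of the total length produced no output in the failing cell.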

## Fundamentals of python 2. Data types, basic operations, & variables

- Below we'll see and work with basic data types (integers, bools, floats, strings)
- Arithmetic operation syntax
- Variables
- Comparison operators (greater, less than, equal to, etc.)
- Logical operators (and, or, not)
- Control structures

In [88]:
# Basic operations with integers, can be run directly
7+11

18

In [115]:
# Though in an interactive environment, I will only see the last output if I do multiple direct operations.
# Use print to guarantee an output:

7+1  # Output will not appear, as it is not the last. 

# Arithmetic operations

print(7+11)
print(5-1)
print(6*6)
print (10/10)
print() # Output an empty line

# Chain commands using semi-colons. Include within same print using comma. 
# Strings and integers can be combined

print(7); print ("gherkin")
print(7, "gherkin")
print()

# Base/exponent syntax

print(10**3)
print()

# Modulo syntax - returns remainder of a division (left / right)

print (6%6)
print (6%4)
print()

# Compound investment on 1000 GBP, 10% return p/a, held for 10 years
print ("My investment would be worth:", 1000*1.1**10)
print ()

# Working with variables
height = 1.88
weight = 85
BMI = weight/ (height**2)
print ("BMI:", BMI)
savings = 1000
return_rate = 1.1
years = 10
result = savings*return_rate**years
print ("Return:", result)
print ()

# Above we've seen strings and integers
# Type influences how operations work

print ("Integers will be added:", 2+4)
print ("Strings will be concatenated:", "Cam"+"el")
print()

# To check type, use type()
# Floats - real number with floating point (decimal)
# Bools - yes/no, true/false, 1/0

print(type(1))
print(type("gherkin"))
print(type(1.782))
print(type(True))
print()

18
4
36
1.0

7
gherkin
7 gherkin

1000

0
2

My investment would be worth: 2593.7424601000025

BMI: 24.049343594386603
Return: 2593.7424601000025

Integers will be added: 6
Strings will be concatenated: Camel

<class 'int'>
<class 'str'>
<class 'float'>
<class 'bool'>
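
Two details from the output above are worth a short sketch: `10/10` printed `1.0` because `/` always returns a float, and the investment result carries floating-point noise in its last digits. The floor-division operator `//`, `divmod()`, and f-string formatting are not used in the cell above; they are shown here as one possible way of handling both.

```python
# True division always returns a float, even for whole results
print(10 / 10)    # 1.0

# Floor division rounds down towards negative infinity
print(7 // 2)     # 3
print(-7 // 2)    # -4

# divmod returns the quotient and the remainder together
quotient, remainder = divmod(7, 2)
print(quotient, remainder)   # 3 1

# Format a float to two decimal places for display
result = 1000 * 1.1 ** 10
formatted = f"{result:.2f}"
print("Return:", formatted)  # Return: 2593.74
```

Formatting only changes how the number is displayed; the value stored in `result` keeps its full precision.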



In [141]:
# Exploring addition behaviour with floats & bools

print (1.1 + 1.7)
print(False + False) # Two wrongs don't make a right
print(True + False) # Got to take the high ground
print ()

# Some type mixing ok, but not others
print (1.1+2)
print ()

# Type conversion is done using
# str(), int(), float(), bool()
pi_string = "3.1415926"
print(float(pi_string) + 2)
print()

# Comparison operators - result will be a boolean

print(2>5) # Greater than
print(5<5) # Less than
print(5<=5) # Less than or equal
print(77>=567) # Greater than or equal
print(10==10) # Equals
print (True!=False) # Does not equal
print ((10 * 10) > 99) # Include arithmetic
print ()

# On strings
print ("a"<"b") # Compared lexicographically by Unicode code point (so uppercase sorts before lowercase)
print ("Python" == "python") # Case sensitive
print ()

# Logical operators / boolean operators (and, or, not)
y = 10
x = 5
z = 2
print (y > x and x > z)
print (x < z or x > y)
print (not True) # not negates a bool

2.8
0
1

3.1

5.1415926

False
False
True
False
True
True
True

True
False

True
False
False
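
The cell above only shows type mixing that works. A short sketch of a few cases that fail or surprise, with the errors caught so that the cell keeps running (the values are illustrative only):

```python
# Adding a string and an integer raises a TypeError
mixing_error = None
try:
    "gherkin" + 7
except TypeError as err:
    mixing_error = type(err).__name__
    print(mixing_error)          # TypeError

# Convert explicitly to make the intent clear
print("gherkin" + str(7))        # gherkin7

# int() cannot parse a decimal string directly, so go via float()
pi_string = "3.1415926"
pi_int = int(float(pi_string))
print(pi_int)                    # 3

# Any non-empty string is truthy, including "False"
print(bool("False"))             # True
print(bool(""))                  # False

# Strings compare by Unicode code point, so uppercase sorts first
print("Z" < "a")                 # True
```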


In [172]:
# Control structures

# For loops

for i in ["one", "two", "three"]:
    print(i)

# While loops

counter = 0
non_inclusive_limit = 3

while counter < non_inclusive_limit:
    print(counter)
    counter += 1

# Conditional statements: if

if counter == 3:
    print("Limit reached") 

# Conditional statements: else

if counter == 4:
    print("Condition met")
else:
    print("Condition not met")

# Conditional statements: elif

if counter == 2:
    print ("Not hit")
elif counter == 3:
    print ("Hit")


one
two
three
0
1
2
Limit reached
Condition not met
Hit


### More on loop control / control flow

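- Break out of the loop as soon as a condition is met (any remaining iterations are skipped)

`break`

- Skip the remainder of the current iteration and move on to the next one

`continue`

- Placeholder statement that does nothing (e.g., where code will be written later)

`pass`

- Exit the enclosing function, optionally returning a value (this also ends any loop running inside it)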

`return`
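
The four keywords above can be sketched together as follows. The function names `first_even` and `not_written_yet` are hypothetical, chosen only for illustration.

```python
def not_written_yet():
    pass  # placeholder so that the empty function body is valid syntax


def first_even(numbers):
    """Return the first even number in numbers, or None if there is none."""
    for number in numbers:
        if number % 2 == 0:
            return number  # exits the function, and so the loop as well
    return None


collected = []
for i in range(10):
    if i % 2 == 1:
        continue  # skip the rest of this iteration for odd numbers
    if i > 6:
        break     # stop looping entirely once i exceeds 6
    collected.append(i)

print(collected)                  # [0, 2, 4, 6]
print(first_even([3, 5, 8, 9]))   # 8
print(not_written_yet())          # None
```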

## Fundamentals of python 3. Python data structures, containers

### Data structure vs container background

- Data structures define layout and relationship between data elements + operations that can be done on them
- Quite general term, encompasses basic data types (e.g., integers) and complex ones (e.g., arrays)
- Containers on the other hand refers to specific data structures that hold/organise multiple values/objects
- These would be for grouping data coherently, e.g., lists, tuples, and dictionaries
- Each has different operations, rules, and characteristics that make them suitable for different things
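
As a quick preview of the three containers named above, here is a minimal sketch with made-up values. Each container supports a different set of operations.

```python
# List: ordered and mutable
temperatures = [18.5, 21.0, 19.2]
temperatures.append(22.1)

# Tuple: ordered and immutable
coordinates = (51.5, -0.12)

# Dictionary: maps keys to values
fruit_prices = {"banana": 0.25, "apple": 0.40}
fruit_prices["orange"] = 0.55

print(len(temperatures), len(coordinates), len(fruit_prices))  # 4 2 3
print(fruit_prices["apple"])                                    # 0.4
```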

### Data structures: Lists

- ordered, mutable collections that allow duplicate elements
- can contain multiple data types (e.g., integers, booleans, floats, or strings)
- defined by square brackets []
- being ordered, each element has a fixed position (defined by a 0-based numerical index)

In [26]:
# List with four string elements, two redundant, stored in a variable

fruit_list = ['banana', 'apple', 'orange', 'orange']
print(fruit_list)

['banana', 'apple', 'orange', 'orange']


In [27]:
# Access index positions (0-based)

print(fruit_list[0])
print(fruit_list[1])
print(fruit_list[2])
print(fruit_list[3])

banana
apple
orange
orange


In [28]:
# List elements are mutable

# First let's make a copy of the list using the list method ".copy()"

# We could do:

# modified_fruit_list = fruit_list

# but then both names would refer to the same list object, so an edit made through one would show up in the other.

# So we make a new (shallow) copy with the method

modified_fruit_list = fruit_list.copy()

# Now we can edit that one without touching the first

modified_fruit_list[2] = 'grape'

print(modified_fruit_list)
print(fruit_list)

['banana', 'apple', 'grape', 'orange']
['banana', 'apple', 'orange', 'orange']
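
The difference between plain assignment and `.copy()` can be checked with the `is` operator, which tests whether two names refer to the same object. A minimal sketch, using a separate hypothetical list `basket` so that `fruit_list` is left untouched:

```python
basket = ['banana', 'apple', 'orange', 'orange']

alias = basket            # a second name for the same list object
shallow = basket.copy()   # a new list object with the same elements

print(alias is basket)    # True
print(shallow is basket)  # False
print(shallow == basket)  # True (equal contents)

alias[0] = 'kiwi'         # the edit is also visible through basket
print(basket[0])          # kiwi
print(shallow[0])         # banana
```

Note that `.copy()` is shallow: nested mutable elements (e.g., a list inside the list) would still be shared between the original and the copy.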


### Data structures: Tuples
- Similar to lists, but immutable. Still ordered and allowing duplicates.
- Defined by parentheses
- For representing data that is related and should never be modified

In [76]:
my_tuple = (1,2,3,4,5)
print (my_tuple)

# Ordered
print (my_tuple[1])

# Variable setting
first_element = my_tuple[0]
print (first_element)

# Immutable
my_tuple[1] = "gherkin"

(1, 2, 3, 4, 5)
2
1


TypeError: 'tuple' object does not support item assignment
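
A short sketch of how tuples tend to be used in practice, with a hypothetical `point` tuple: unpacking into variables, catching the assignment error rather than letting it stop the cell, and using a tuple as a dictionary key (possible because tuples of immutable values are hashable).

```python
point = (3, 4)   # hypothetical x, y coordinates

# Unpack the tuple into separate variables
x, y = point
print(x, y)                      # 3 4

# Catch the error instead of stopping the cell
assignment_failed = False
try:
    point[0] = 10
except TypeError as err:
    assignment_failed = True
    print("TypeError:", err)

# Tuples are hashable, so they can be used as dictionary keys
labels = {point: "corner"}
print(labels[(3, 4)])            # corner
```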

### Data structures: Dictionaries


### Data structures: Sets

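- Unordered collections of unique elements (duplicates are removed automatically)
- Defined with curly brackets, e.g. {1, 2, 3}, or with set(); an empty set must be written set(), because {} creates an empty dictionary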
- useful for membership testing and removing duplicates

In [68]:
my_list = [1, 2, 2, 3, 4, 4, 5, 5]
my_set = set(my_list)

print(my_list)
print(my_set)

[1, 2, 2, 3, 4, 4, 5, 5]
{1, 2, 3, 4, 5}
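
Membership testing and deduplication, mentioned in the list above, can be sketched as follows. Since sets are unordered, `sorted()` is used here wherever a predictable order is wanted.

```python
my_set = {1, 2, 3, 4, 5}

# Membership testing
print(3 in my_set)        # True
print(9 in my_set)        # False

# Deduplicate a list; sorted() gives back a list in a defined order
unique_sorted = sorted(set([5, 4, 4, 2, 2, 1]))
print(unique_sorted)      # [1, 2, 4, 5]

# Empty braces create a dictionary, not a set
print(type({}))           # <class 'dict'>
print(type(set()))        # <class 'set'>

# Intersection: elements present in both sets
common = sorted(my_set & {4, 5, 6})
print(common)             # [4, 5]
```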


### Data structures: Strings


### Data structures: Arrays


### Data structures: Queues

### Data structures: Others to mention not in stdlib

- Stacks
- Linked lists
- Trees
- Graphs



## Fundamentals of python 4. Functions and modules

- input() function
- functions in general

## Fundamentals of python 5. Object-Oriented Programming (OOP)



## Fundamentals of python 6. File Handling

- for and while loops on files

## Fundamentals of python 7. Exception Handling

- try
- except
- finally

## Fundamentals of python 8. Debugging and Testing

## Fundamentals of python 9. Basic Algorithms and Logic

## Fundamentals of python 10. Standard library

## Somewhere: Python methods & attributes

- We've used a Python method before, ".copy": `modified_fruit_list = fruit_list.copy()`

- You can use the built-in dir() function to get a list of attributes and methods available for an object, including modules, classes, and instances.

- So below we'll look at that using platform, the module we imported from previously

In [67]:
import platform

# Use dir() function to inspect attributes
platform_dir = dir(platform)
print (platform_dir)

# Run a few
pv = platform.python_version()
print(pv)
pi = platform.python_implementation()
print(pi)

# Note: python_version() can also be called directly here, but only because it was imported by name earlier (from platform import python_version)
python_version()

['_Processor', '_WIN32_CLIENT_RELEASES', '_WIN32_SERVER_RELEASES', '__builtins__', '__cached__', '__copyright__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '__version__', '_comparable_version', '_component_re', '_default_architecture', '_follow_symlinks', '_get_machine_win32', '_ironpython26_sys_version_parser', '_ironpython_sys_version_parser', '_java_getprop', '_libc_search', '_mac_ver_xml', '_node', '_norm_version', '_os_release_cache', '_os_release_candidates', '_os_release_line', '_os_release_unescape', '_parse_os_release', '_platform', '_platform_cache', '_pypy_sys_version_parser', '_sys_version', '_sys_version_cache', '_sys_version_parser', '_syscmd_file', '_syscmd_ver', '_uname_cache', '_unknown_as_blank', '_ver_output', '_ver_stages', 'architecture', 'collections', 'freedesktop_os_release', 'functools', 'itertools', 'java_ver', 'libc_ver', 'mac_ver', 'machine', 'node', 'os', 'platform', 'processor', 'python_branch', 'python_build', 'python_com

'3.11.4'
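
The raw `dir()` output is long and mostly private names. One possible way to make it easier to read, sketched here, is to filter out names that start with an underscore and then check which of the rest are callable.

```python
import platform

# Keep only the public names (those without a leading underscore)
public_names = [name for name in dir(platform) if not name.startswith("_")]
print(len(public_names), "public names in platform")

# Of those, keep the callables (functions and classes)
callables = [name for name in public_names if callable(getattr(platform, name))]
print("python_version" in callables)   # True

# dir() works on any object, e.g. a list instance
list_methods = [name for name in dir([]) if not name.startswith("_")]
print("copy" in list_methods)          # True
```

The same pattern applies to any module, class, or instance, which makes it a handy first step when exploring an unfamiliar object.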
