## Python Language Basics

### Language Semantics

#### Indentation, not braces

Python uses whitespace (tabs or spaces) to structure code instead of braces, as in
many other languages like R, C++, Java, and Perl. A colon denotes the start of an indented code block, after which all of the code must be indented by the same amount until the end of the block. Consider a for loop from a sorting algorithm:

In [25]:
array = [4, 9, 2, 7, 3, 5] # Example list
pivot = 5                  # Example pivot value
less = []                  # Initialize an empty list for values < pivot
greater = []               # Initialize an empty list for values >= pivot
              

for x in array:
    if x < pivot:
        less.append(x)
    else:
        greater.append(x)    

print("Less than pivot:", less)
print("Greater or equal to pivot:", greater)

Less than pivot: [4, 2, 3]
Greater or equal to pivot: [9, 7, 5]


In [26]:
a = 5; b = 6; c = 7  # Semicolons can separate multiple statements on a single line

#### Comments

Comments are text preceded by the hash mark (#); they are ignored by the Python interpreter.

They are sometimes used to exclude certain blocks of code without deleting them.

They can also occur after a line of executed code.

In [None]:
file_handle = ["foo one\n", "\n", "foo two\n"]  # hypothetical stand-in for an open file
results = []
for line in file_handle:
    # keep the empty lines for now
    # if len(line) == 0:
    #     continue
    results.append(line.replace("foo", "bar"))

In [31]:
print("Reached this line") #Simple status report

Reached this line


#### Function and object method calls

Functions are called using parentheses and passing zero or more arguments, optionally assigning the returned value to a variable.



In [37]:
result = f(x, y, z)
g()

(g)


In [36]:
# Define the functions first
def f(x, y, z):
    """A simple function that adds three numbers"""
    return x + y + z

def g():
    """A function with no parameters"""
    print("(g)")

# Define some variables
x = 10
y = 20
z = 30

# Now you can call them
result = f(x, y, z)
print(f"Result: {result}")  # Output: Result: 60

g()  # Output: (g)

Result: 60
(g)


In [None]:
# Syntax for calling attached functions (Methods) that have access to the object's internal contents
obj.some_method(x, y, z) 

In [42]:
# Create a list object
obj = [1, 2, 3]

# Call methods on it
obj.append(4)  # Method call with one parameter
print(obj)     # Output: [1, 2, 3, 4]

# Define variables for method parameters
x, y, z = 10, 20, 30
obj.extend([x, y, z])  # Method call using variables
print(obj)     

[1, 2, 3, 4]
[1, 2, 3, 4, 10, 20, 30]


In [44]:
# Functions can take both positional and keyword arguments.
# The f defined above accepts only x, y, and z, so passing d and e raises a TypeError.
result = f(a, b, c, d=5, e="foo")

TypeError: f() got an unexpected keyword argument 'd'

In [45]:
def flexible_function(*args, **kwargs):
    """Function that accepts any number of positional and keyword arguments"""
    print(f"Positional arguments: {args}")
    print(f"Keyword arguments: {kwargs}")
    
    # Process the arguments
    if args:
        total = sum(args)
        print(f"Sum of positional args: {total}")
    
    for key, value in kwargs.items():
        print(f"  {key} = {value}")

# Now you can call it with any arguments
flexible_function(1, 2, 3, d=5, e="foo", name="example")

Positional arguments: (1, 2, 3)
Keyword arguments: {'d': 5, 'e': 'foo', 'name': 'example'}
Sum of positional args: 6
  d = 5
  e = foo
  name = example


#### Variable and Argument Passing

In [46]:
a = [1,2,3]

In [47]:
b = a
b

[1, 2, 3]

In [48]:
a.append(4)

In [49]:
b

[1, 2, 3, 4]
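
The same sharing happens when an object is passed to a function: the parameter becomes another name for the caller's object, so in-place changes are visible outside the function. A minimal sketch, where append_element and data are hypothetical names chosen for illustration:

```python
def append_element(some_list, element):
    # some_list refers to the caller's list object, not a copy
    some_list.append(element)

data = [1, 2, 3]
append_element(data, 4)
print(data)  # [1, 2, 3, 4]
```

If the function should not modify its argument, it can work on a copy instead, for example list(some_list).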

#### Dynamic References, Strong Types

In [50]:
# a variable can refer to a different type of object simply by doing an assignment.

In [51]:
a = 5

In [52]:
type(a)

int

In [53]:
a = "foo"

In [54]:
type(a)

str

In [55]:
# Variables are names for objects within a particular namespace; the type information is stored in the object itself. Some observers might hastily conclude that Python is not a “typed language.” This is not true; consider this example:

In [57]:
"5" + 5 # This will raise an error because you cannot concatenate a string and an integer

TypeError: can only concatenate str (not "int") to str

In [58]:
# Python is a strongly typed language, which means that every object has a specific type (or class), and implicit conversions will occur only in certain permitted circumstances, such as:

In [59]:
a = 4.5

In [60]:
b = 2

In [61]:
print(f"a is {type(a)}, b is {type(b)}")

a is <class 'float'>, b is <class 'int'>


In [62]:
# Here, even though b is an integer, it is implicitly converted to a float for the division operation.
a/b

2.25

In [63]:
# using the isinstance function to check that an object is an instance of a particular type

a = 5

In [64]:
isinstance(a, int)

True

In [65]:
# isinstance can accept a tuple of types if you want to check that an object’s type is among those present in the tuple:
a = 7; b = 6.5

In [66]:
isinstance(a,(int, float))

True

In [67]:
isinstance(b,(int, float))

True

#### Attributes and Methods

Objects in Python typically have both attributes (other Python objects stored
“inside” the object) and methods (functions associated with an object that can
have access to the object’s internal data). Both of them are accessed via the syntax obj.attribute_name:

In [68]:
a = "foo"

In [69]:
a.capitalize

<function str.capitalize()>

In [70]:
# Attributes and methods can also be accessed by name via the getattr function:
getattr(a, "split")

<function str.split(sep=None, maxsplit=-1)>

#### Duck Typing

Duck typing is a programming concept where an object's suitability is determined by its ability to perform certain actions (its methods or attributes) rather than by its declared type.

For example, you can verify that an object is iterable if it implements the iterator protocol. For many objects,
this means it has an __iter__ “magic method,” though an alternative and better way
to check is to try using the iter function:


In [71]:
def isiterable(obj):
    try:
        iter(obj)
        return True
    except TypeError:  # not iterable
        return False

In [72]:
# This function would return True for strings as well as most Python collection types:

In [73]:
isiterable("a string")

True

In [74]:
isiterable([1, 2, 3])

True

In [75]:
isiterable(5)

False
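
One place a check like this could be useful is in a function that accepts several kinds of sequences and normalizes them to a list first. The sketch below assumes a hypothetical helper named as_list; it repeats isiterable so that it runs on its own:

```python
def isiterable(obj):
    try:
        iter(obj)
        return True
    except TypeError:  # not iterable
        return False

def as_list(x):
    # Convert any iterable that is not already a list; leave everything else alone
    if not isinstance(x, list) and isiterable(x):
        x = list(x)
    return x

print(as_list((1, 2, 3)))  # [1, 2, 3]
print(as_list("abc"))      # ['a', 'b', 'c']
print(as_list(5))          # 5
```

Note that strings are iterable, so as_list splits them into characters; whether that is desirable depends on the use case.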

#### Imports

In [76]:
# In Python, a module is simply a file with the .py extension containing Python code

In [77]:
# some_module.py
PI = 3.14159

def f(x):
    return x + 2

def g(a, b):
    return a + b

In [78]:
import os
print(os.getcwd())  # Shows where Python is looking for files

c:\Users\kanyi\Desktop\ipynb


In [None]:
# How to access the variables and functions defined in some_module.py, from another file in the same directory
import some_module
result = some_module.f(5)
pi = some_module.PI

In [None]:
# Or alternatively, you can import specific functions or variables directly:
from some_module import g, PI
result = g(5, PI)

In [None]:
# By using the as keyword, you can give imports different variable names
import some_module as sm
from some_module import PI as pi, g as gf
r1 = sm.f(pi)
r2 = gf(6, pi)

#### Binary Operators and Comparisons

Most of the binary math operations and comparisons use familiar mathematical
syntax used in other programming languages.

In [87]:
5-7

-2

In [88]:
12 + 21.5

33.5

In [89]:
5<=2

False

In [90]:
# Use the "is" keyword to check if two variables refer to the same object
a = [1, 2, 3]

In [91]:
b = a

In [92]:
c = list(a)

In [93]:
a is b

True

In [94]:
a is not c

True

In [96]:
# Since the list function always creates a new Python list (i.e., a copy), we can be sure that c is distinct from a. Comparing with is is not the same as the == operator, because in this case we have:

a == c  # This checks if the contents are the same, not if they are the same object

True

In [97]:
a is c  # This checks if they are the same object in memory

False

In [98]:
# A common use of "is" and "is not" is to check whether a variable is None, since there is only one instance of None
a = None

In [99]:
a is None

True

#### Mutable and Immutable Objects

Many objects, such as lists, dictionaries, NumPy arrays, and most user-defined
types (classes), are mutable. This means that the object or the values that it
contains can be modified.

Others, like strings and tuples, are immutable, which means their internal data
cannot be changed.

In [100]:
a_list = ["foo", 2, [4, 5]]  # Example list with a mutable inner list

In [102]:
a_list[2] = (3, 4)  # Changing the inner list to a tuple

In [103]:
a_list

['foo', 2, (3, 4)]

In [106]:
# strings and tuples, are immutable, which means their internal data cannot be changed

In [104]:
a_tuple =(3, 5, (4, 5))

In [105]:
a_tuple[1] = "four"

TypeError: 'tuple' object does not support item assignment
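
Immutability applies to the tuple's own slots, not to the objects stored in them. As an illustration (using a variant of a_tuple whose last element is a list rather than a tuple), a mutable object inside a tuple can still be changed in place:

```python
a_tuple = (3, 5, [4, 5])

a_tuple[2].append(6)   # allowed: this mutates the inner list, not the tuple
print(a_tuple)         # (3, 5, [4, 5, 6])

try:
    a_tuple[2] = [0]   # not allowed: this would rebind one of the tuple's slots
except TypeError as exc:
    print("TypeError:", exc)
```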

### Scalar Types

Python has a small set of built-in types for handling numerical data, strings, Boolean (True or False) values, and dates and times.

Standard Python scalar types

None -> The Python “null” value (only one instance of the None object exists)

str -> String type; holds Unicode strings

bytes -> Raw binary data

float -> Double-precision floating-point number (note there is no separate double type)

bool -> A Boolean True or False value

int -> Arbitrary precision integer

#### Numeric Types

The primary Python types for numbers are int and float. 
An int can store arbitrarily large numbers.

In [107]:
ival = 17239871

In [108]:
ival ** 6

26254519291092456596965462913230729701102721

#####

Floating-point numbers are represented with the Python float type. Under the hood, each one is a double-precision value. They can also be expressed with scientific notation.

In [109]:
fval = 7.243

In [110]:
fval2 = 6.78e-5

#####
Integer division not resulting in a whole number will always yield a floating-point
number.

In [111]:
3 / 2

1.5

#####
To get C-style integer division (which drops the fractional part if the result is not a whole number), use the floor division operator //.

In [112]:
3 // 2

1
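
One caveat worth knowing: // rounds toward negative infinity, whereas integer division in C truncates toward zero, so the two agree only when the result is not negative. A small illustration with arbitrary values:

```python
floor_pos = 7 // 2          # 3
floor_neg = -7 // 2         # -4 (floor: rounds down)
trunc_neg = int(-7 / 2)     # -3 (truncation toward zero, as in C)

print(floor_pos, floor_neg, trunc_neg)  # 3 -4 -3
```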

#### Strings 

Strings can be written using either single quotes ' or double quotes " (double quotes are generally favored).

In [116]:
a = 'one way of writing a string'
b = "another way"

#####
For multiline strings with line breaks, you can use triple quotes, either ''' or """

In [114]:
c = """
This is a longer string that
spans
multiple lines
"""

#####
The string c actually contains four lines of text; the line breaks after """ and after each line are included in the string. We can count the newline
characters with the count method on c.

In [115]:
c.count("\n")

4

##### 
Python strings are immutable; you cannot modify a string.

In [117]:
a = "this is a string"

In [118]:
a[10] = "f"

TypeError: 'str' object does not support item assignment

#####
The error message above tells us that we tried to replace the character (the “item”) at position 10 with the letter "f", but this is not allowed for string objects. If we need to modify a string, we have to use a function or method that creates a new string, such as the string replace method.

In [119]:
b = a.replace("string", "longer string")

In [120]:
b

'this is a longer string'

#####
After this operation, the variable "a" is unmodified.

In [121]:
a

'this is a string'

#####
Many Python objects can be converted to a string using the str function.

In [122]:
a = 5.6

In [123]:
s = str(a)

In [124]:
print(s)

5.6


#####
Strings are a sequence of Unicode characters and therefore can be treated like other sequences, such as lists and tuples.

In [125]:
s = "python"

In [126]:
list(s)

['p', 'y', 't', 'h', 'o', 'n']

In [None]:
s[:3] # This slicing operation returns the first three characters of the string

'pyt'

In [None]:
s[3:] # This slicing operation returns the characters from position 3 to the end of the string

'hon'

In [131]:
s = "12\\34"  # The backslash is an escape character (used for special characters like newline \n), so a literal backslash must itself be escaped as \\

In [132]:
print(s)

12\34


#####
If you have a string with a lot of backslashes and no special characters, you might find this a bit annoying. Fortunately you can preface the leading quote of the string with r, which means that the characters should be interpreted as is.

In [135]:
s = r"this\has\no\special\characters" # r stands for raw string, meaning that the characters should be interpreted as is.

In [134]:
s

'this\\has\\no\\special\\characters'

#####
Adding two strings together concatenates them and produces a new string.

In [136]:
a = "this is the first half "

In [137]:
b = "and this is the second half"

In [138]:
a + b

'this is the first half and this is the second half'

#####
String objects have a format method
that can be used to substitute formatted arguments into the string, producing a new string.

In [139]:
template = "{0:.2f} {1:s} are worth US${2:d}"

#####
In this string:

• {0:.2f} means to format the first argument as a floating-point number with two decimal places.

• {1:s} means to format the second argument as a string.

• {2:d} means to format the third argument as an exact integer.

To substitute arguments for these format parameters, we pass a sequence of arguments to the format method.


In [140]:
template.format(88.46, "Argentine Pesos", 1)

'88.46 Argentine Pesos are worth US$1'

#####
f-strings (formatted string literals) make creating formatted strings even more convenient.

An f-string is created by writing the character f immediately before a string literal.

Within the string, enclose Python expressions in curly braces to substitute the value of the expression into the formatted string:


In [141]:
amount = 10

In [142]:
rate = 88.46

In [143]:
currency = "Pesos"

In [144]:
result = f"{amount} {currency} is worth US${amount / rate}"

#####
Format specifiers can be added after each expression using the same syntax as with the string templates above.

In [145]:
f"{amount} {currency} is worth US${amount / rate:.2f}"

'10 Pesos is worth US$0.11'
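
A few more format specifiers that tend to come up often are sketched below; the variable names and values are made up for illustration:

```python
value = 1234567.891

thousands = f"{value:,.2f}"   # comma as thousands separator, two decimals
padded = f"{42:>6d}"          # integer right-aligned in a field of width 6
percent = f"{0.256:.1%}"      # multiply by 100 and append a percent sign

print(thousands)     # 1,234,567.89
print(repr(padded))  # '    42'
print(percent)       # 25.6%
```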

#### Bytes and Unicode

Bytes and Unicode are fundamental concepts for handling text and binary data in Python.

Unicode is a standard for representing text characters from all languages. In Python, strings are Unicode by default.

Bytes represent raw binary data - sequences of integers from 0-255. They're used for files, network communication, and when you need to store text in a specific encoding.


In [146]:
# Unicode strings 
text = "Hello, 世界!"  # Mix of English and Chinese characters
emoji_text = "Python is fun! 🐍✨"
arabic_text = "مرحبا بالعالم"

print(text)        # Output: Hello, 世界!
print(emoji_text)  # Output: Python is fun! 🐍✨
print(arabic_text) # Output: مرحبا بالعالم

# Unicode code points
print(ord('A'))    # Output: 65 (Unicode code point for 'A')
print(chr(65))     # Output: A (character from code point)
print(ord('世'))   # Output: 19990 (Unicode code point for Chinese character)

Hello, 世界!
Python is fun! 🐍✨
مرحبا بالعالم
65
A
19990


In [147]:
# Creating bytes
binary_data = b"Hello"  # Bytes literal (only ASCII characters)
print(binary_data)      # Output: b'Hello'
print(type(binary_data)) # Output: <class 'bytes'>

# Bytes from integers
byte_array = bytes([72, 101, 108, 108, 111])  # ASCII values for "Hello"
print(byte_array)  # Output: b'Hello'

# Individual byte access
print(binary_data[0])  # Output: 72 (ASCII value of 'H')

b'Hello'
<class 'bytes'>
b'Hello'
72


In [148]:
val = "español"

In [149]:
val

'español'

#####
The Unicode string is converted to its UTF-8 bytes representation using the encode method.

In [150]:
val_utf8 = val.encode("utf-8")

In [151]:
val_utf8

b'espa\xc3\xb1ol'

In [152]:
type(val_utf8)

bytes

#####
Assuming you know the Unicode encoding of a bytes object, you can go back using
the decode method.

In [153]:
val_utf8.decode("utf-8")

'español'

#####
While it is now preferable to use UTF-8 for any encoding, for historical reasons you may encounter data in any number of different encodings.

In [154]:
val.encode("latin1")

b'espa\xf1ol'

In [155]:
val.encode("utf-16")

b'\xff\xfee\x00s\x00p\x00a\x00\xf1\x00o\x00l\x00'

In [156]:
val.encode("utf-16le")

b'e\x00s\x00p\x00a\x00\xf1\x00o\x00l\x00'

#####
It is most common to encounter bytes objects in the context of working with files, where implicitly decoding all data to Unicode strings may not be desired.
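
As a rough sketch of that situation, the same file can be read as raw bytes or as decoded text depending on the mode passed to open. The temporary directory and the file name example.txt are assumptions made only for this illustration:

```python
import os
import tempfile

# Write text to a temporary file, choosing the encoding explicitly
path = os.path.join(tempfile.mkdtemp(), "example.txt")  # hypothetical file name
with open(path, "w", encoding="utf-8") as fh:
    fh.write("español")

# Binary mode ("rb") returns the raw bytes, with no decoding
with open(path, "rb") as fh:
    raw = fh.read()

# Text mode decodes the bytes to a str using the given encoding
with open(path, encoding="utf-8") as fh:
    text = fh.read()

print(raw)   # b'espa\xc3\xb1ol'
print(text)  # español
```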

#### Booleans

The two Boolean values in Python are written as True and False.

Comparisons and other conditional expressions evaluate to either True or False.

Boolean values are combined with the "and" and "or" keywords.

In [157]:
True and True

True

In [158]:
False or True

True

#####
When converted to numbers, False becomes 0 and True becomes 1.

In [159]:
int(False)

0

In [160]:
int(True)

1

#####
The keyword not flips a Boolean value from True to False or vice versa.

In [161]:
a = True

In [162]:
b = False

In [163]:
not a 

False

In [164]:
not b

True

#### Type Casting

This is the process of converting a value from one data type to another in Python.

The str, bool, int, and float types are also functions that can be used to cast values to those types.

In [165]:
s = "3.14159"

In [166]:
fval = float(s)

In [168]:
int(fval)

3

In [169]:
bool(fval)

True

In [170]:
bool(0)

False

#####
NB: Most nonzero values when cast to bool become True.
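
To make the rule concrete: zero values, empty containers, empty strings, and None cast to False, while essentially everything else casts to True. The two lists below are arbitrary examples chosen for illustration:

```python
falsy = [0, 0.0, "", [], {}, None]
truthy = [-1, 0.1, "0", " ", [0]]

for value in falsy:
    print(repr(value), bool(value))   # each of these prints False

for value in truthy:
    print(repr(value), bool(value))   # each of these prints True
```

Note that the non-empty string "0" is True, even though int("0") is 0.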

#### None

This is Python's special value that represents "nothing" or "no value." It's a singleton object of the NoneType class and is Python's equivalent to null in other programming languages.

In [171]:
a = None

In [172]:
a is None

True

In [173]:
b = 5

In [174]:
b is not None

True

#####
None is also a common default value for function arguments

In [175]:
def add_and_maybe_multiply(a, b, c=None):
    result = a + b

    if c is not None:
        result = result * c
    return result    
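
A short usage sketch of this function follows (the definition is repeated so the example runs on its own). Testing c is not None rather than the truthiness of c matters when 0 is a legitimate multiplier:

```python
def add_and_maybe_multiply(a, b, c=None):
    result = a + b
    if c is not None:
        result = result * c
    return result

print(add_and_maybe_multiply(2, 3))     # 5
print(add_and_maybe_multiply(2, 3, 4))  # 20
print(add_and_maybe_multiply(2, 3, 0))  # 0, because 0 is not None
```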

#### Dates and Times
The built-in Python datetime module provides the datetime, date, and time types. The datetime type combines the information stored in date and time and is the most commonly used.

In [176]:
from datetime import datetime, date, time

In [180]:
dt = datetime(2011, 10, 29, 20, 30, 21)

In [181]:
dt.day

29

In [182]:
dt.minute

30

#####
Given a datetime instance, you can extract the equivalent date and time objects by calling methods on the datetime of the same name.

In [183]:
dt.date()

datetime.date(2011, 10, 29)

In [184]:
dt.time()

datetime.time(20, 30, 21)

#####
The strftime method formats a datetime as a string.

In [185]:
dt.strftime("%Y-%m-%d %H:%M:%S")

'2011-10-29 20:30:21'

In [186]:
dt.strftime("%Y-%m-%d %H:%M")


'2011-10-29 20:30'

#####
Strings can be converted (parsed) into datetime objects with the strptime function.

In [187]:
datetime.strptime("20091031", "%Y%m%d")

datetime.datetime(2009, 10, 31, 0, 0)
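
Formatting and parsing are inverse operations when the same format string is used for both. A minimal round-trip sketch, where fmt is just an illustrative variable name:

```python
from datetime import datetime

fmt = "%Y-%m-%d %H:%M:%S"
dt = datetime(2011, 10, 29, 20, 30, 21)

text = dt.strftime(fmt)                 # datetime -> str
parsed = datetime.strptime(text, fmt)   # str -> datetime

print(text)           # 2011-10-29 20:30:21
print(parsed == dt)   # True
```

The round trip is exact here only because the format string includes every field that dt carries; a format that omits the seconds would lose them.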

#####
When you are aggregating or otherwise grouping time series data, it will occasionally be useful to replace time fields of a series of datetimes—for example, replacing the minute and second fields with zero.

In [188]:
dt_hour = dt.replace(minute=0, second=0)

In [189]:
dt_hour

datetime.datetime(2011, 10, 29, 20, 0)

#####
Since datetime.datetime is an immutable type, methods like these always produce
new objects. So in the previous example, dt is not modified by replace.

In [190]:
dt

datetime.datetime(2011, 10, 29, 20, 30, 21)

In [191]:
dt2 = datetime(2011, 11, 15, 22, 30)

In [192]:
delta = dt2 - dt

In [193]:
delta

datetime.timedelta(days=17, seconds=7179)

In [194]:
type(delta)

datetime.timedelta

#####
The output timedelta(days=17, seconds=7179) indicates that the timedelta encodes an offset of 17 days and 7,179 seconds.

Adding a timedelta to a datetime produces a new shifted datetime.

In [195]:
dt

datetime.datetime(2011, 10, 29, 20, 30, 21)

In [196]:
dt + delta

datetime.datetime(2011, 11, 15, 22, 30)
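
A timedelta can also be constructed directly instead of being obtained from a subtraction. The sketch below rebuilds the same offset by hand and shows a couple of related operations; the three-hour shift is an arbitrary example:

```python
from datetime import datetime, timedelta

dt = datetime(2011, 10, 29, 20, 30, 21)
delta = timedelta(days=17, seconds=7179)

shifted = dt + delta
later = dt + timedelta(hours=3)

print(shifted)                 # 2011-11-15 22:30:00
print(later)                   # 2011-10-29 23:30:21
print(delta.total_seconds())   # 1475979.0
```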

### Control Flow
This refers to the order in which your program executes statements and makes decisions.

#### if, elif, and else

The if statement checks a condition that, if True, evaluates the code in the block that follows.

In [197]:
x = -5
if x < 0:
    print("It's negative")

It's negative


#####
An if statement can be optionally followed by one or more elif blocks and a catchall else block if all of the conditions are False:

In [199]:
if x < 0:
    print("It's negative")
elif x == 0:
    print("Equal to zero")
elif 0 < x < 5:
    print("Positive but smaller than 5")
else:
    print("Positive and larger than or equal to 5")

It's negative


#####
If any of the conditions are True, no further elif or else blocks will be reached.
With a compound condition using and or or, conditions are evaluated left to right and will short-circuit.

In [200]:
a = 5; b = 7

In [201]:
c = 8; d = 4

In [202]:
if a < b or c > d:
    print("Made it")

Made it
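
In the example above, the comparison c > d is never evaluated because a < b is already True. One way to observe short-circuiting directly is to record which operands actually run; check and calls below are hypothetical helpers written only for this demonstration:

```python
calls = []

def check(label, value):
    # Record that this operand was evaluated, then return its value
    calls.append(label)
    return value

result_or = check("left", True) or check("right", True)
print(calls)   # ['left']

calls.clear()
result_and = check("left", False) and check("right", True)
print(calls)   # ['left']
```

In both cases the right-hand operand is skipped because the left-hand one already determines the result.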


#### for loops
These are for iterating over a collection (like a list or tuple) or an iterator. The standard syntax for a for loop is:

In [205]:
collection = [1, 2, 3]  # hypothetical example data
for value in collection:
    # do something with value
    pass

##### 
You can advance a for loop to the next iteration, skipping the remainder of the block, using the continue keyword.

Consider this code, which sums up integers in a list and skips None values.

In [206]:
sequence = [1, 2, None, 4, None, 5]
total = 0
for value in sequence:
    if value is None:
        continue
    total += value

#####
A for loop can be exited altogether with the break keyword. This code sums elements of the list until a 5 is reached.

In [207]:
sequence = [1, 2, 0, 4, 6, 5, 2, 1]
total_until_5 = 0
for value in sequence:
    if value == 5:
        break
    total_until_5 += value

#####
The break keyword only terminates the innermost for loop; any outer for loops will continue to run.

In [208]:
for i in range(4):
    for j in range(4):
        if j > i:
            break
        print((i, j))


(0, 0)
(1, 0)
(1, 1)
(2, 0)
(2, 1)
(2, 2)
(3, 0)
(3, 1)
(3, 2)
(3, 3)


#####
If the elements in the collection or iterator are sequences (tuples or lists, say), they can be conveniently unpacked into variables in the for loop statement.


In [211]:
iterator = [(1, 2, 3), (4, 5, 6)]  # hypothetical example data
for a, b, c in iterator:
    # do something with a, b, and c
    pass

#### while loops
They specify a condition and a block of code that is to be executed until the condition evaluates to False or the loop is explicitly ended with break:


In [2]:
x = 256
total = 0
while x > 0:
    if total > 500:
        break
    total += x
    x = x // 2

#### pass
This is the “no-op” (or “do nothing”) statement.

It can be used in blocks where no action is to be taken (or as a placeholder for code not yet implemented); it is required only because Python uses whitespace to delimit blocks.


In [3]:
if x < 0:
    print("negative!")
elif x == 0:
    # TODO: put something smart here
    pass
else:
    print("positive!")


positive!


#### range
This function generates a sequence of evenly spaced integers.

In [4]:
range(10)  # Generates numbers from 0 to 9

range(0, 10)

In [5]:
list(range(10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

##### 
A start, end, and step (which may be negative) can be given:

In [6]:
list(range(0, 20, 2))

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

In [7]:
list(range(5, 0, -1))

[5, 4, 3, 2, 1]

#####
range produces integers up to but not including the endpoint. A common use of range is for iterating through sequences by index.

In [8]:
seq = [1, 2, 3, 4]

In [9]:
for i in range(len(seq)):
    print(f"element {i}: {seq[i]}")


element 0: 1
element 1: 2
element 2: 3
element 3: 4


#####
While you can use functions like list to store all the integers generated by range in some other data structure, often the default iterator form will be what you want. This snippet sums all numbers from 0 to 99,999 that are multiples of 3 or 5.

In [10]:
total = 0

In [11]:
for i in range(100_000):
    # % is the modulo operator
    if i % 3 == 0 or i % 5 == 0:
        total += i

In [12]:
print(total)

2333316668


#####
While the range generated can be arbitrarily large, the memory use at any given time may be very small.
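
One way to see this is to compare the size of a range object with the size of the list built from it. The exact byte counts depend on the Python build, so the sketch below only compares them rather than assuming specific numbers; lazy and materialized are illustrative names:

```python
import sys

lazy = range(1_000_000)
materialized = list(lazy)

print(sys.getsizeof(lazy))          # small and independent of the range length
print(sys.getsizeof(materialized))  # grows with the number of elements
print(sys.getsizeof(lazy) < sys.getsizeof(materialized))  # True
```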