# Introduction to Jupyter notebook and Python primitive types 
## Author: Erika Duan

![](../02_figures/00_jupyter-notebook-header.jpg)

# Accessing and navigating the operating system   

Before we begin to work with files, we must be able to navigate our computer's file system.   

## Using the `os` module

The `os` module is useful for providing cross-platform file management and contains commands like:  

+ `os.getcwd()` to get the current working directory. 
+ `os.chdir()` to set the working directory to another path name.  
+ `os.path.basename()` to get the last component of a path name (i.e. the file name). 
+ `os.path.exists()` to check if a path exists (it will return True or False).  
+ `os.path.isdir()` to check if a path is a file directory. 
+ `os.path.isfile()` to check if a path is a file.
+ `os.path.join('rootdirectory', 'folder', 'filename')` to build a valid path for your operating system. 

In [1]:
#-----example 1-----
import os 

my_working_dir = os.getcwd()
os.path.isdir(my_working_dir)

True

In [2]:
#-----example 2-----
os.path.join("Users", "user", "Documents", "Python_tutorials") # returns an OS-compatible file path  

'Users\\user\\Documents\\Python_tutorials'
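
The other `os.path` helpers listed above can be tried out in the same way. The sketch below is a minimal illustration which assumes a throwaway directory created with `tempfile` and a hypothetical file name `example.txt`, so it does not depend on any real folder on your computer.

```python
import os
import tempfile

# create a temporary directory so the example does not touch real files
with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "example.txt")  # hypothetical file name

    with open(file_path, "w") as f:
        f.write("hello")

    file_name = os.path.basename(file_path)  # last component of the path
    path_exists = os.path.exists(file_path)  # True whilst the file exists
    is_file = os.path.isfile(file_path)      # True as the path is a file
    is_dir = os.path.isdir(file_path)        # False as the path is not a directory

print(file_name, path_exists, is_file, is_dir)
```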

## Using the `pathlib` module  

The `pathlib` module is available in Python $\geq$ 3.4 and provides a less verbose alternative to the `os` module for locating and setting file paths. An important concept is its `pathlib.Path` class, whose objects can be broken down into `.name`, `.parent`, `.stem`, `.suffix` and `.anchor` components and combined with functions like:  

+ `pathlib.Path.cwd()` to get the current working directory.  
+ `pathlib.Path(r'')` to manually specify a path input.   
+ `.joinpath('folder', 'subfolder')` to join parts of a path together.  

**Note**: A guide to using `pathlib` functions is found [here](https://realpython.com/python-pathlib/).   

In [3]:
#-----example 1------
import pathlib  

print(pathlib.Path.cwd()) # retrieve working directory  
type(pathlib.Path.cwd())

C:\Users\user\Desktop\Introduction-to-Python\03_notebooks


pathlib.WindowsPath

In [4]:
#-----example 2-----
alt_path = pathlib.Path(r'C:\Users\user\Desktop') # r'' creates a raw string so backslashes are not treated as escape characters
alt_path.joinpath('Introduction-to-Python', '03_notebooks') 

WindowsPath('C:/Users/user/Desktop/Introduction-to-Python/03_notebooks')

In [5]:
#-----example 3----- 
notebook_path = pathlib.Path.cwd().joinpath('00_Jupyter-notebook-and-Python-primitive-types.ipynb')
print(notebook_path)

print(notebook_path.name)
print(notebook_path.suffix)
print(notebook_path.stem)

print(pathlib.Path.cwd().parent)
print(pathlib.Path.cwd().parent.parent)

C:\Users\user\Desktop\Introduction-to-Python\03_notebooks\00_Jupyter-notebook-and-Python-primitive-types.ipynb
00_Jupyter-notebook-and-Python-primitive-types.ipynb
.ipynb
00_Jupyter-notebook-and-Python-primitive-types
C:\Users\user\Desktop\Introduction-to-Python
C:\Users\user\Desktop


In [6]:
#-----example 4-----  
[f.stem for f in pathlib.Path.cwd().iterdir()] # list all file and folder names without their suffix

['.ipynb_checkpoints',
 '00_Jupyter-notebook-and-Python-primitive-types',
 '01_Writing-conditions-and-for-and-while-loops-in-Python',
 '02_Writing-functions-in-Python',
 '03_Python-data-structures',
 '04_Manipulating-Pandas-DataFrames',
 '07_NLTK-analysis']

In [7]:
#-----example 5-----
# .iterdir() iterates over all files and folders in a given directory  
import collections
collections.Counter(f.suffix for f in pathlib.Path.cwd().iterdir()) # count all file types

Counter({'': 1, '.ipynb': 6})
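
The `.anchor` component is not shown in the examples above. The sketch below assumes a hypothetical Windows-style path and uses `pathlib.PureWindowsPath`, which only manipulates the path text, so it should behave the same way on any operating system and the path does not need to exist.

```python
import pathlib

# PureWindowsPath parses Windows-style paths without accessing the file system
# the path below is hypothetical
example_path = pathlib.PureWindowsPath(r'C:\Users\user\Desktop\notes.txt')

print(example_path.anchor)  # the drive and root of the path
print(example_path.parent)  # the folder containing the file
print(example_path.name)    # notes.txt
print(example_path.stem)    # notes
print(example_path.suffix)  # .txt
```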

# Navigating Jupyter notebooks

Jupyter notebooks are a versatile way to document and run Python code.   
A notebook consists of many individual cells (similar to the individual code chunks found in an R Markdown file).   

You can:  

+ Toggle between edit (`enter key`) and command (`esc key`) mode.
+ Toggle between code (`y key`) and markdown (`m key`) format whilst in command mode.
+ Whilst in command mode, create new cells above or below using `a | b keys` and navigate through cells using `up | down keys`.
+ Use `control + enter keys` to run a cell, or `shift + enter keys` to run a cell and move to the next one.  
+ Whilst in command mode, use `x | c | v keys` to cut, copy or paste individual cells and press the `d key` twice to delete a cell.  
+ Each cell only displays the value of its last expression, unless other objects are printed using `print()`.    

Jupyter notebooks can display LaTeX in markdown cells via `$LaTeX$` (displayed in-line) or `$$LaTeX$$` (displayed on a new line) i.e. $x^2 + y^\frac{1}{2}$. 

In [8]:
#-----example 1-----
x = [1, 2, 3] # use square brackets to store data in a list
x 

y = [4, 5, 6]
y # each cell only prints the last output

[4, 5, 6]

In [9]:
#-----example 2-----
x = [1, 2, 3] 
print(x) 

y = [4, 5, 6]
y # both objects are printed 

[1, 2, 3]


[4, 5, 6]

# Assigning and deleting object references    

In Python, we use `=` to assign a name to an object in memory.     
This means `b = a` creates two references pointing to the same object in memory i.e. when a mutable object like a list is modified in place through `b`, the change is also seen through `a`. 

**Note:** This contrasts with R, which exhibits copy-on-modify behaviour: when `b` is modified, it points to a new object stored separately in memory from the object referenced by `a`.      

When assigning variables, you must follow some rules:  
+ Names must start with a letter (`a-z, A-Z`) or underscore (`_`) and can be followed by any number of letters, digits (`0-9`), or underscores (`_`).
+ Names cannot be the same as reserved words (e.g. `False`, `True`, `None`, `and`, `if`, `for`, `while`). 
+ Names are case-sensitive: 'YOU', 'you', 'You', and 'yOu' are all different names in Python.
+ Names cannot contain operator symbols (e.g. `+`, `-` or `*`).  

**Note:** To make a copy of an object, use `b = a.copy()` or `b = a[:]` instead. 

In [10]:
#-----example 1-----
x = [1, 2, 3] 

del x # deletes reference to the object in memory
# print(x) now creates an error   

In [11]:
#-----example 2-----
a = [1, 2, 3, 4, 5]
b = a 
b[0] = 5 

print(a) # a is also modified as a and b point to the same object in memory

[5, 2, 3, 4, 5]


In [12]:
#-----example 3-----
a = [1, 2, 3, 4, 5]
b = a.copy() 
b[0] = 5 

print(a) # a is not modified as b is a copy of the object a

[1, 2, 3, 4, 5]
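
Whether two names point to the same object can be checked with the `is` operator. The sketch below is a small illustration using made-up lists; it also shows that re-assigning a name (as opposed to modifying the object in place) leaves the original object untouched.

```python
a = [1, 2, 3]
b = a          # b is a second name for the same list
c = a[:]       # c is a new list containing the same values

print(b is a)  # True as both names reference one object
print(c is a)  # False as c references a separate object
print(c == a)  # True as the values are equal

b.append(4)    # an in-place modification is visible through a
print(a)       # [1, 2, 3, 4]
print(c)       # [1, 2, 3]

b = [0]        # re-assignment binds the name b to a new object
print(a)       # [1, 2, 3, 4]
```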


# Getting help  

You can find help with functions by typing `help(function)` or `?function`.    

![](../02_figures/00_function-help-in-jupyter-notebook.jpg)  


**Note:** `help()` also works on methods, classes and modules (e.g. `help(str.upper)`). To get help on an operator, pass it as a string instead (e.g. `help('+')`).  

# Printing statements  

Use the `print("string and {placeholder}".format(value))` format to print statements.  
When printing multiple values, the position of each value specified in `.format(x, y, z)` can be referenced numerically, starting from zero, as `{0} {1} {2}`. 

In [13]:
#-----example 1-----
name = "Erika"
first_language = "R"
year_R = 2018

print("My name is {0} and I started learning {1} in {2}. I like {1} and Python!".format(name, first_language, year_R))

My name is Erika and I started learning R in 2018. I like R and Python!
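
As a further sketch, placeholders can also be given names rather than positions. The field names `learner` and `lang` below are arbitrary choices made for illustration.

```python
name = "Erika"
language = "R"

# positional placeholders are numbered from 0 and can be reused
by_position = "{0} likes {1}. {1} is fun!".format(name, language)

# named placeholders are matched to keyword arguments
by_name = "{learner} likes {lang}.".format(learner=name, lang=language)

print(by_position)  # Erika likes R. R is fun!
print(by_name)      # Erika likes R.
```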


# Variable scope   

Variables can have local or global scope:  

+ Local scope refers to a variable that is only available inside the function in which it is created (e.g. variables created inside user-defined functions).   
+ Local variables created inside a user-defined function are discarded once the function finishes running.   
+ Global scope refers to a variable that is available to any block of code once it is created.    
+ Referencing global variables should be avoided inside user-defined functions (you have no control over whether the global variable has changed).   
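
The points above can be sketched with a small example. The names `bonus`, `add_bonus` and `add_bonus_safely` are hypothetical and only used for illustration.

```python
bonus = 10  # global variable (hypothetical name)

def add_bonus(score):
    total = score + bonus  # total is local, bonus is read from the global scope
    return total

first_result = add_bonus(5)
print(first_result)  # 15

try:
    print(total)  # total only existed whilst add_bonus() was running
    name_error_raised = False
except NameError:
    name_error_raised = True
print(name_error_raised)  # True

bonus = 20  # changing the global variable silently changes the function output
second_result = add_bonus(5)
print(second_result)  # 25

def add_bonus_safely(score, bonus_points):
    return score + bonus_points  # depends only on its own arguments

safe_result = add_bonus_safely(5, 10)
print(safe_result)  # 15
```

Passing the value in as an argument, as in `add_bonus_safely()`, keeps the function output predictable regardless of what happens to the global variable.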

![](../02_figures/00_python-primitive-types-header.jpg)

# Introduction to Python primitive types   

Primitive types in Python consist of:  

+ Strings (i.e. characters) 
+ Integers (i.e. whole numbers)
+ Floats or doubles (i.e. decimal numbers)
+ Complex numbers (i.e. 4-3j) 
+ Boolean values (i.e. True or False)  
+ None type (NoneType contains a single `None` value similar to `NA` in R. All user-defined functions return `None` unless a return value is explicitly specified.)

A conversion hierarchy also exists (i.e. boolean values can be converted into integers, floats or strings, but a string can only be converted into an integer or float if it represents a valid number, e.g. `int("3")`).  
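
A minimal sketch of these conversions, using made-up values, is shown below. Note that `bool()` applied to a string only checks whether the string is empty, so it does not interpret the text inside the string.

```python
from_bool = (int(True), float(True), str(True))
print(from_bool)  # (1, 1.0, 'True')

from_numeric_string = (int("42"), float("3.5"))
print(from_numeric_string)  # (42, 3.5)

try:
    int("forty-two")  # a non-numeric string cannot be converted
    conversion_failed = False
except ValueError:
    conversion_failed = True
print(conversion_failed)  # True

string_to_bool = (bool(""), bool("False"))
print(string_to_bool)  # (False, True) as any non-empty string is True
```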

## Integers and floats  
Integers can be converted into floats using `float()` whilst floats are truncated towards zero (i.e. the decimal part is discarded) when converted into integers using `int()`.   

Mathematical operations can be performed on integers and floats and include:  

+ Arithmetic operations like `+`, `-`, `*`, `/`, `//` (divide and floor), `**` (exponential) and `%` (modulus).  
+ Assignment operations like `+=`, `-=`, `*=`, `/=`, `//=` etc.
+ Comparisons i.e. `==` (equal), `!=` (not equal), `>` (greater), `>=` (greater or equal to) etc.   

**Note**: The operator precedence is `**`, then `*`, `/`, `//` and `%` (which share the same precedence and are evaluated from left to right) and then `+` and `-`. 

In [14]:
#-----example 1-----
int(3.0) # convert a float into an integer

3

In [15]:
#-----example 2-----
float(4000) # convert an integer into a float

4000.0

In [16]:
#-----example 3-----
int(3.4), int(3.6) # floats are truncated not rounded during integer conversion

(3, 3)

In [17]:
#-----example 4-----  
10 // 3, 10 % 3   

# % calculates the remainder left over after floor division
# // calculates the quotient rounded down to the nearest whole number

(3, 1)

In [18]:
#-----example 5-----
(8 * 3 % 3), (8 + 3 % 3) # note % has precedence over +   

(0, 8)

In [19]:
#-----example 6-----
int(True) # you can convert boolean values into integers i.e. True == 1, False == 0

1
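
Negative numbers make the difference between `int()`, flooring and rounding easier to see. The sketch below uses made-up values and the standard library `math.floor()` function for comparison, followed by two operator precedence checks.

```python
import math

truncated = (int(3.6), int(-3.6))
print(truncated)  # (3, -3) as int() discards the decimal part

floored = (math.floor(3.6), math.floor(-3.6))
print(floored)  # (3, -4) as flooring always rounds down

floor_division = (-7 // 2, -7 % 2)
print(floor_division)  # (-4, 1) as // also rounds down

precedence = (2 + 3 * 4 ** 2, 8 / 4 * 2)
print(precedence)  # (50, 4.0) as ** runs first, then * and / from left to right
```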

## Strings

Strings are used to store characters and are declared inside quotes i.e. `"string"` or `'string'` similar to R.      
Strings can also be combined (i.e. addition and multiplication works on strings), although it is safer to join strings using `.join()` and split strings using `.split()`.  

Individual characters in a string can be accessed using a 0-based integer index, and substrings can be extracted by slicing.    
However, strings are immutable objects (i.e. you cannot change the string in place with a command like `string[0] = "new character"`).   

**Note:** Numerical values can be coerced into strings using `str()`. This might be useful for numerical IDs that should be treated as character strings. Otherwise, avoid this behaviour as numerical operations cannot be performed on character strings.  

In [20]:
#-----example 1-----
my_name = "Erika" + " " + "M" + " " + "Duan" # manually add spaces via " "
my_name 

'Erika M Duan'

In [21]:
#-----example 2-----  
my_name[-4:] # extract the last four characters in the string

'Duan'

In [22]:
#-----example 3-----
alternates = "abcabcabc" 
alternates[0::3] # extract every third character starting from position 0

'aaa'

In [23]:
#-----example 4-----
alternates[::-1] # a step of -1 reverses the string

'cbacbacba'

In [24]:
#-----example 5-----
Gollum_says = "my" + " " + "precious" + "s" * 5 + "!"
Gollum_says

'my precioussssss!'

In [25]:
#-----example 6-----
numerical_string = str(400)
# numerical_string + 1 produces an error 

print(numerical_string + "1")
# the + operator concatenates numerical strings rather than adding them as numbers

4001
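
String immutability can be illustrated with a short sketch. The variable names below are made up; the usual workaround is to build a new string from slices of the old one.

```python
word = "jupyter"

try:
    word[0] = "J"  # item assignment is not supported for strings
    assignment_failed = False
except TypeError:
    assignment_failed = True
print(assignment_failed)  # True

new_word = "J" + word[1:]  # build a new string instead
print(word, new_word)  # jupyter Jupyter
```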


Useful methods for manipulating strings can be found [here](https://docs.python.org/3/library/stdtypes.html#string-methods) e.g. `.capitalize()` and `.title()` for changing character case, `.startswith()` and `.endswith()` for checking how a string begins or ends, `.strip()` for stripping leading and trailing white space, etc.       

In [26]:
#-----example 1-----
("hello word!").title()

'Hello Word!'

In [27]:
#-----example 2-----
("Singapore").startswith("S")

True

In [28]:
#-----example 3----- 
string = "-".join(["a", "b", "c", "d", "e"])
string

'a-b-c-d-e'

In [29]:
#-----example 4-----
string.split("-") # split string into a list of strings  

['a', 'b', 'c', 'd', 'e']

In [30]:
#-----example 5-----
(" ").join("   Student  ID 1".split())

# remove all unnecessary whitespace by first splitting the string into individual words
# then rejoin words with " " as the separator  

'Student ID 1'
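
The methods `.strip()`, `.endswith()` and `.capitalize()` are mentioned above but not demonstrated. A brief sketch using a made-up file name is shown below.

```python
raw_name = "  student_01.csv  "  # hypothetical file name with extra white space

clean_name = raw_name.strip()  # remove leading and trailing white space
print(clean_name)  # student_01.csv

is_csv = clean_name.endswith(".csv")
print(is_csv)  # True

capitalised = "hello world".capitalize()  # only the first character is upper case
titled = "hello world".title()  # the first character of every word is upper case
print(capitalised, "|", titled)  # Hello world | Hello World
```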

# Introduction to object-oriented programming  

A class can be thought of as a blueprint for creating new objects that have specific attributes and methods assigned to them.  

The word `self` is used to represent an instance of the class, usually when defining new object attributes or methods. A more detailed explanation can be found [here](https://stackoverflow.com/questions/625083/what-init-and-self-do-on-python).      

In [31]:
#-----example 1----- 
class BasicRectangle: # define BasicRectangle class
    width = 0
    height = 0
    
    def area(self): # define function area with respect to self
        return self.width * self.height  

# create a new BasicRectangle object (the brackets create an instance of the class)

rectangle_a = BasicRectangle()
rectangle_a.width = 10
rectangle_a.height = 4
rectangle_a.area() # self is passed automatically when a method is called on an instance

# a coding improvement is to use def __init__ to pre-define the behaviour of your object 

40

In [32]:
#-----example 2-----
class BasicRectangle: 
    
    def __init__(self, width = 0, height = 0): # define the constructor method
        self.width = width # linking self.width to the width argument above  
        self.height = height
        
    def area(self): 
        return self.width * self.height 
    
    def perimeter(self):
        return 2 * self.width + 2 * self.height
    
# create a new BasicRectangle object and set width and height in one call  

rectangle_b = BasicRectangle(width = 4, height = 3)
rectangle_b.area(), rectangle_b.perimeter()

(12, 14)

In [33]:
#-----example 3-----  
class BasicRectangle:  
    
    def __init__(self, width = 0 , height = 0):
        self.set_width(width = width) 
        self.set_height(height = height)
        
    def set_width(self, width): # a setter method, which is also called from inside __init__  
        self.width = width
        
    def set_height(self, height): 
        self.height = height
        
    def area(self):
        return self.width * self.height  
    
# create a new BasicRectangle object 
# apply the method set_width to change an object property   
# apply the method area()  

rectangle_c = BasicRectangle(width = 10, height = 2) 
print(rectangle_c.area()) # 10 * 2 = 20

rectangle_c.set_width(20)
rectangle_c.area() # 20 * 2 = 40

20


40

# Watch out for the 3 pet peeves of Python!  

1. The first object in a list is stored in position `[0]` whilst the last object is stored in position `[-1]`.
2. To slice from a list, always slice between `[a:b+1]` to extract the values in positions a to b, as the end position of a slice is excluded.    
3. The code `list2 = list1` merely creates a second reference pointing to the same list i.e. object. Use `list2 = list1[:]` or `list2 = list1.copy()` to create a new reference to a duplicate object instead.  
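
The first two points can be sketched with a made-up list of letters:

```python
letters = ["a", "b", "c", "d", "e"]

first_and_last = (letters[0], letters[-1])
print(first_and_last)  # ('a', 'e')

# to extract positions 1 to 3, the slice must end at 3 + 1
middle = letters[1:4]
print(middle)  # ['b', 'c', 'd']

too_short = letters[1:3]  # the end position is excluded
print(too_short)  # ['b', 'c']
```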