<div class="alert alert-block alert-info">
Author:<br>Felix Gonzalez, P.E. <br> Adjunct Instructor, <br> Division of Professional Studies <br> Computer Science and Electrical Engineering <br> University of Maryland Baltimore County <br> fgonzale@umbc.edu
</div>

This notebook provides an overview of some basic concepts in the Python programming language and Jupyter Notebooks. Python is an object-oriented, high-level, general-purpose programming language with dynamic semantics, used for web and app development among many other areas. It is relatively simple and easy to learn because its syntax focuses on readability. It supports the use of modules and packages, so programs can be designed in a modular style and code can be reused across a variety of projects. Other characteristics include:

- Open source/free: no need to worry about licences.
- Cross-platform: can be used on Windows, macOS, and Linux, and even on Android and iOS!
- Full-featured packages: if there’s something you want to do, there’s probably a package out there to help you.
- Code portability: most code can run unaltered on a wide range of computers so long as all the required modules are supplied.
- Large and growing community: people from all fields, from data science to physics, are coding in Python, creating a diverse and rich community of experts all over.

Jupyter Notebook (https://jupyter.org/) is a graphical user interface that runs in a web browser and can run Python and other programming languages. It is a tool used in data science to perform and document data analyses. It helps with running code within cells, documenting results, and sharing code and results, among other things.

Python documentation can be found at https://docs.python.org. The Python community also maintains the Python Enhancement Proposals (PEPs) at https://peps.python.org/.

# Table of Contents
[Tips and Shortcuts](#Tips-and-Shortcuts)

[Objects](#Objects)

[Functions](#Functions)

[Jupyter Notebook Magic Commands](#Jupyter-Notebook-Magic-Commands)

[Variables](#Variables)

[Variable Naming Conventions and Naming Styles](#Variable-Naming-Conventions-and-Naming-Styles)

[Operations with Variables](#Operations-with-Variables)

[CASTING: Specifying Data Type](#CASTING:-Specifying-Data-Type)

[Working With Strings](#Working-With-Strings)

[Command-line String Input() Function](#Command-line-String-Input()-Function)

[Arithmetic Operators](#Arithmetic-Operators)

[Lambda Expressions](#Lambda-Expressions)

[Conditions, Logical Operators, If statements and Loops](#Conditions,-Logical-Operators,-If-statements-and-Loops)

[Modules, Packages, and Libraries](#Modules,-Packages,-and-Libraries)

[Modules: Importing](#Modules:-Importing)

[Modules: Installing](#Modules:-Installing)

# Tips and Shortcuts
[Return to Table of Contents](#Table-of-Contents)

Some shortcuts may vary depending on the platform on which you open the Jupyter Notebook (e.g., Jupyter Notebook, Windows, Mac, Linux, Google Colab, Visual Studio Code, Anaconda Cloud, etc.).

#### Important Tips:
- There are various types of cells in Jupyter Notebooks, with the most used being:
    - "Markdown" cell: holds text with explanations.
    - "Code" cell: allows you to write Python code and run it.
- To edit a Jupyter Notebook cell, double-click it.
- Be careful with quotes, especially when handling text and strings. In Python, either double quotes "" or single quotes ' ' can be used to delimit a string. However, they need to be alternated when a quote character appears inside the string itself. More on this later.
- <b><u>DOCUMENTATION</u></b> of a module/library/package is extremely important. During the class we will be using various Python libraries and accessing the documentation to learn about the included functions and parameters. The base Python documentation can be found at: https://docs.python.org/3/contents.html.
- Python Style Guide: https://peps.python.org/pep-0008/

#### Most Useful Jupyter Notebook Shortcuts
- Shift+Tab: Access <b><u>documentation</u></b> of a function. 
- Ctrl+S: Saves and creates checkpoint of the notebook.
- Shift+Enter: Runs a cell.

#### Accessing Documentation:
Documentation can be accessed in various ways that will be discussed in more detail during the class.
- The documentation of a module/library/package and its functions is published by the developer on their website. Note that <b><u>documentation</u></b> quality may vary from library to library.
- Shift+Tab with the cursor placed inside the function parentheses.
- help() function, with the name of the function inside the parentheses.


####  Commenting
Allows you to add a comment within a Code cell. The comment does not run code and is used for documentation purposes.

In [1]:
# In a cell, when you put the "#" character, the rest is accepted as comment
# This is a comment
# This is another comment.

In [2]:
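# If you are going to write a long comment, you don't have to put # at the beginning of each line.
# You can wrap the text in triple quotes instead (the same syntax used for "docstrings"),
# which is also very useful if you are going to copy and paste some long text.
# Note that Python still evaluates the text as a string, so the notebook shows it as the cell output.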
'''
This text is in a comment 
So is this text
Text text text text
Comment comment comment
'''

'\nThis text is in a comment \nSo is this text\nText text text text\nComment comment comment\n'

# Objects
[Return to Table of Contents](#Table-of-Contents)

Objects are bundles of data and code that are wrapped up in a particular way. One thing this format enables is attaching labels to them to create variables. Objects can have their own functions and variables, so you can have variables inside other variables. Objects generally do some particular job. Objects can be lists, dictionaries, functions, instances of classes, etc.

References:
- https://docs.python.org/3/tutorial/classes.html#a-word-about-names-and-objects
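
As a small illustration of the description above, the sketch below shows that even a plain string is an object with a type and its own functions (methods). The variable names are made up for this example.

```python
# Every value in Python is an object with a type and attached functions (methods).
course_name = "data science"               # the label course_name points to a str object

object_type = type(course_name)            # the class the object belongs to
title_version = course_name.title()        # a method that belongs to str objects
has_upper = hasattr(course_name, "upper")  # objects carry named attributes

print(object_type)
print(title_version)
print(has_upper)
```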

# Functions
[Return to Table of Contents](#Table-of-Contents)

Functions are blocks of code that perform a task. For example, the function round(number, ndigits=None) has two parameters: (1) number and (2) ndigits. The "number" is the number that we wish to round, while "ndigits" is the precision in decimal digits. See below for an example of the round() function. When analyzing data, the main goal is to select the combination of functions that allows us to clean and transform the data, show visualizations, run a model, and perform other tasks.

A list of the base Python built-in functions can be found at: https://docs.python.org/3/library/functions.html.

In [3]:
round(number = 2.432442, ndigits = 2)
# Putting the cursor within the parentheses of the function and pressing Shift+Tab shows the function documentation.

2.43

In [4]:
# Note that the parameter names can be omitted and the function will work as long as the values are given in the required order.
round(2.432442, 2)

2.43

In [5]:
help(round) # The help function is similar to the "Shift-Tab" but prints the documentation. 

Help on built-in function round in module builtins:

round(number, ndigits=None)
    Round a number to a given precision in decimal digits.

    The return value is an integer if ndigits is omitted or None.  Otherwise
    the return value has the same type as the number.  ndigits may be negative.
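
The help text above mentions that ndigits may be omitted or negative. A short sketch of those cases, using made-up values:

```python
# ndigits omitted: the result is an integer.
rounded_default = round(2.432442)

# Negative ndigits rounds to the left of the decimal point (here, to the nearest hundred).
rounded_hundreds = round(1234.567, -2)

# A value exactly halfway between two integers is rounded to the nearest even integer.
rounded_half = round(2.5)

print(rounded_default, rounded_hundreds, rounded_half)
```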



# Jupyter Notebook Magic Commands
[Return to Table of Contents](#Table-of-Contents)

Jupyter Notebooks have specific "magic" commands that perform a set of tasks and only work within the Jupyter Notebook (IPython) environment. Magic commands cover commonly used tasks that would otherwise require multiple lines of Python code and functions to develop.

Magic commands start with '%' (line magics, which apply to a single line) or '%%' (cell magics, which apply to the whole cell). The list of all magic commands can be found at: https://ipython.readthedocs.io/en/stable/interactive/magics.html. The most common magic commands include:
- %lsmagic 
- %%time 
- %who 
- %who str
- %who int
- %pinfo
- %env
- %run
- %load
- %%writefile
- %matplotlib inline
- %matplotlib widget

The cells below provide some examples on using some of the magic commands.

In [6]:
%lsmagic
# List of available magic commands.

Available line magics:
%alias  %alias_magic  %autoawait  %autocall  %automagic  %autosave  %bookmark  %cd  %clear  %cls  %code_wrap  %colors  %conda  %config  %connect_info  %copy  %ddir  %debug  %dhist  %dirs  %doctest_mode  %echo  %ed  %edit  %env  %gui  %hist  %history  %killbgscripts  %ldir  %less  %load  %load_ext  %loadpy  %logoff  %logon  %logstart  %logstate  %logstop  %ls  %lsmagic  %macro  %magic  %mamba  %matplotlib  %micromamba  %mkdir  %more  %notebook  %page  %pastebin  %pdb  %pdef  %pdoc  %pfile  %pinfo  %pinfo2  %pip  %popd  %pprint  %precision  %prun  %psearch  %psource  %pushd  %pwd  %pycat  %pylab  %qtconsole  %quickref  %recall  %rehashx  %reload_ext  %ren  %rep  %rerun  %reset  %reset_selective  %rmdir  %run  %save  %sc  %set_env  %store  %sx  %system  %tb  %time  %timeit  %unalias  %unload_ext  %uv  %who  %who_ls  %whos  %xdel  %xmode

Available cell magics:
%%!  %%HTML  %%SVG  %%bash  %%capture  %%cmd  %%code_wrap  %%debug  %%file  %%html  %%javascript  %%js  %%l

In [7]:
%%time
# Note that comments need to be put after the cell magic command.
# In some cases we want to see how long it takes a code block to run.
# Comparing the run time of various code blocks is called benchmarking.
# Benchmarking can help find places where a code block takes too long and may reveal opportunities for improvement.
# %%time shows the amount of time a cell takes to run.
i = 0
# A while loop iterates as long as its condition is true. More on this later.
while i < 1000000: # Add or remove a 0 to perform a sensitivity analysis on how long the code takes to iterate.
    i = i + 1

CPU times: total: 78.1 ms
Wall time: 86.4 ms
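
Because magic commands only work inside Jupyter, a similar measurement in a plain Python script needs ordinary code. The sketch below is one possible approach using the standard library's time.perf_counter(); the variable names are chosen for this example, and the measured time will differ from machine to machine.

```python
import time

start = time.perf_counter()      # high-resolution clock from the standard library

counter = 0
while counter < 1_000_000:
    counter = counter + 1

elapsed_seconds = time.perf_counter() - start
print(f"Loop finished in {elapsed_seconds:.4f} seconds")
```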


In [8]:
# Defining variables for the examples below
variable_1 = 5 # Integer
variable_2 = 'hello' # Text String 
variable_3 = 50.1 # Float 
variable_4 = True # Boolean value

In [9]:
%who
# Shows a list of all defined variables. 
# Until this point in the notebook only variable "i" and variables 1 through 4 have been defined.
# Any other names in the output were not defined in this notebook.
# See next section for more information on variables.

dataframe_columns	 dataframe_hash	 dtypes_str	 get_dataframes	 getpass	 hashlib	 i	 import_pandas_safely	 is_data_frame	 
json	 variable_1	 variable_2	 variable_3	 variable_4	 


In [10]:
%who str
# Shows list of defined variables that are text string.
# Until this point in the notebook the only string variable defined is variable_2.

variable_2	 


In [11]:
%who int
# Shows list of defined variables that are integers.

i	 variable_1	 


In [12]:
%pinfo variable_4
# Prints information on the defined variable (i.e., variable 4).

Type:        bool
String form: True
Docstring:  
Returns True when the argument is true, False otherwise.
The builtins True and False are the only two instances of the class bool.
The class bool is a subclass of the class int, and cannot be subclassed.

# Built-in Types/Data Types

Python has various types that are built into the interpreter. The principal built-in types include numerics (e.g., integer, float, complex), Booleans (True/False), sequences (e.g., list, tuple, range), text sequences (strings), binary sequences (e.g., bytes, bytearray), mappings (e.g., dictionaries), classes, instances, and exceptions, among others. The documentation for built-in types can be found at: https://docs.python.org/3/library/stdtypes.html, and https://docs.python.org/3/library/datatypes.html.

Note that other sources and libraries often refer to numeric (integer, float, complex), Boolean (True/False), string, and datetime as the main data types, and to lists and dictionaries as data collections.

The function type() can be used to check the data type of a variable. The example below uses the previously generated variables.

In [13]:
type(variable_1)

int

In [14]:
type(variable_2)

str

In [15]:
type(variable_3)

float

In [16]:
type(variable_4)

bool
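
A related built-in function, isinstance(), answers a yes/no question about a value's type. The sketch below uses made-up variable names; it also shows that a Boolean counts as an integer, which matches the earlier %pinfo output stating that bool is a subclass of int.

```python
sample_int = 5
sample_text = "hello"
sample_flag = True

is_int = isinstance(sample_int, int)
is_text = isinstance(sample_text, str)
is_number = isinstance(50.1, (int, float))   # a tuple checks several types at once
flag_is_int = isinstance(sample_flag, int)   # True, because bool is a subclass of int

print(is_int, is_text, is_number, flag_is_int)
```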

# Variables
[Return to Table of Contents](#Table-of-Contents)

The following overview is for Base Python functions. List of Python built-in functions can be found at: https://docs.python.org/3/library/functions.html.

Unlike many other programming languages, Python has no command for declaring a variable. Variables are the combination of an identifying label and a value, often a literal like 2 or "hello world". The label/identifier is attached to the value. A variable is created the moment you first assign a value to it.

Variables are generally used to hold the result of calculations and user inputs. These are things we can't predict before the code is run. 

There is no need to declare any variable type before setting it. Python will detect the type (e.g., integer, float, string). They are called variables because you can change the value. Anything can be a variable in Python: 
- numbers, 
- strings, 
- functions,
- modules, 
- chunks of code, 
- etcetera. 
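
To illustrate the idea that a variable is a label attached to a value, the sketch below attaches two labels to the same list (lists are covered later) and changes the list through one of them. The names are hypothetical and only serve this example.

```python
first_label = [1, 2, 3]
second_label = first_label       # no copy is made; both labels point to one list

second_label.append(4)           # change the object through the second label

same_object = first_label is second_label   # "is" checks whether both labels share one object
print(first_label)
print(same_object)
```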

In [17]:
x = 1
y = 3.4
z = 'UMBC'
show = "Game of Thrones"
Answer1 = True
Answer2 = False

In [18]:
# I can access the variable with the print function.
print(x) # With the cursor inside the parentheses, try Shift+Tab to access the print function documentation.

1


In [19]:
# I can access the "x" value by putting:
x # Here the output is the integer itself. We will see later that print() returns None (a NoneType object).

1

In [20]:
type(print(y))

3.4


NoneType

In [21]:
type(y)

float

In [22]:
print(z)

UMBC


In [23]:
z

'UMBC'

In [24]:
print(show)

Game of Thrones


In [25]:
print(Answer1)

True


In [26]:
print(Answer2)

False


Note that x is an integer, and y is a float. z and show variables are strings. Answer1 and Answer2 variables are Booleans. To check we can use the type() function.

Reference:
- Python Interpreter Types: https://docs.python.org/3/library/types.html#Standard%20Interpreter%20Types

In [27]:
type(print(x)) # Note that print() displays the value but returns None, so the type is NoneType.

1


NoneType

In [28]:
type(x)

int

In [29]:
type(y)

float

In [30]:
type(z)

str

In [31]:
type(Answer1)

bool

In [32]:
# We can get pretty creative with the print function and also call several functions within a cell:
print(f'Note that x is an {type(x)}, y is a {type(y)}, z is a {type(z)} and Answer1 is {type(Answer1)}')
# The "f" prefix above creates a formatted string literal (f-string).
# Alternatively the print above can also be:
print('Note that x is an', type(x),', y is a', type(y),'z is a ',type(z),'and Answer1 is', type(Answer1), '\n')
# "\n" inserts a new line.

# Without the "f" prefix the text is treated as a plain string:
print('Note that x is an {type(x)}, y is a {type(y)}, z is a {type(z)} and Answer1 is {type(Answer1)}')

Note that x is an <class 'int'>, y is a <class 'float'>, z is a <class 'str'> and Answer1 is <class 'bool'>
Note that x is an <class 'int'> , y is a <class 'float'> z is a  <class 'str'> and Answer1 is <class 'bool'> 

Note that x is an {type(x)}, y is a {type(y)}, z is a {type(z)} and Answer1 is {type(Answer1)}


Reference:
- Python Output (String Formatting): https://docs.python.org/3/tutorial/inputoutput.html

In [33]:
# Note the differences to using the print and formatting.
print(f'The value of x is {x}, y is {y}, z is {(z)} and Answer1 is {(Answer1)}') # Formatted string literals
print('The value of x is {0}, y is {1}, z is {2} and Answer1 is {3}'.format(x, y, z, Answer1)) # String format() method

The value of x is 1, y is 3.4, z is UMBC and Answer1 is True
The value of x is 1, y is 3.4, z is UMBC and Answer1 is True


In [34]:
# Applying specific formatting as defined.
print(f'The value of x is {x:.3f}, y is {y:.2f}, z is {(z).lower()} and Answer1 is {(Answer1)}')
print('The value of x is  {0} , y is {1}, z is {2} and Answer1 is {3}'.format(x, y, z.lower(), Answer1))
'The value of x is  {:9} , y is {:.2f}, z is {} and Answer1 is {}'.format(x, y, z.lower(), Answer1)

The value of x is 1.000, y is 3.40, z is umbc and Answer1 is True
The value of x is  1 , y is 3.4, z is umbc and Answer1 is True


'The value of x is          1 , y is 3.40, z is umbc and Answer1 is True'
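
A few more format specifications are sketched below, using made-up values. The part after the colon controls separators, decimals, and alignment.

```python
population = 1234567.891
share = 0.256
campus = "UMBC"

with_commas = f"{population:,.2f}"   # thousands separator, two decimals
as_percent = f"{share:.1%}"          # multiply by 100 and add a percent sign
right_aligned = f"{campus:>10}"      # pad on the left to a width of 10
centered = f"{campus:^10}"           # center within a width of 10

print(with_commas)
print(as_percent)
print(repr(right_aligned))           # repr() makes the padding spaces visible
print(repr(centered))
```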

In [35]:
# Variables do not need to be declared with any particular type and can even change type after they have been set.
x = 1
x = 3.4125415
x = "Data Science"
print(x)

Data Science


In [36]:
# Note on double quotes and single quotes. The first three examples will work.
print("That is my car.")
print('That is my car.')
print("That's my car.") # Note use of different quotes.

That is my car.
That is my car.
That's my car.


In [37]:
# This example results in a SyntaxError because the apostrophe in "That's" ends the string early.
print('That's my car.')

SyntaxError: unterminated string literal (detected at line 2) (704259835.py, line 2)

In [38]:
# Be careful with the use of single and double quotes. When analyzing data this may cause issues.
# We will explore potential challenges with quotes and how to address them when analyzing text data.
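
Besides alternating the quote types, a quote character inside a string can be escaped with a backslash, or the string can be wrapped in triple quotes. A minimal sketch with made-up variable names:

```python
escaped = 'That\'s my car.'                 # the backslash escapes the inner single quote
alternated = "That's my car."               # double quotes outside, single quote inside
triple = '''She said, "That's my car."'''   # triple quotes allow both kinds inside

print(escaped)
print(alternated)
print(triple)
```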

# Variable Naming Conventions and Naming Styles
[Return to Table of Contents](#Table-of-Contents)

By now we have used and named various variables. A variable can have a short name (like x and y) or a more descriptive name (age, carname, total_volume). 

Rules for Python variables naming:<br>
[1] A variable name must start with a letter or the underscore character;<br>
[2] A variable name CANNOT start with a number;<br>
[3] A variable name can only contain alpha-numeric characters and underscores (A-Z, a-z, 0-9, and _ );<br>
[4] Variable names are case-sensitive (age, Age and AGE are three different variables); and<br>
[5] Do not use the names of built-in functions (e.g., print, type) as variable names, because doing so hides the function.

There is no relationship between a variable's value or use and its name. Generally you should name variables so you can tell what they are used for. In general, the more meaningful your names, the easier it will be to understand the code, when you come back to it, and the less likely you are to use the wrong variable.
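
The naming rules can be checked in code. The sketch below is one way to do it, using the string method isidentifier() and the standard library's keyword module; the candidate names are made up. Reserved keywords such as "class" pass the character rules but still cannot be used as variable names.

```python
import keyword

candidate_names = ["total_volume", "_hidden", "2nd_value", "car-name", "class"]

results = {}
for name in candidate_names:
    valid_syntax = name.isidentifier()     # follows the character rules
    reserved = keyword.iskeyword(name)     # reserved words cannot be variable names
    results[name] = valid_syntax and not reserved
    print(f"{name!r}: usable as a variable name = {results[name]}")
```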

Style conventions aren't syntax, but they allow all coders to recognise what an element is. There's a style guide for Python (PEP 8) at:
https://www.python.org/dev/peps/pep-0008/ Its naming conventions recommend lowercase words joined with underscores for variable and function names.

This style is known as <b>snake_case</b>, for example, perimeter_of_a_square

Though, where Python is working with C or other code, <b>camelCase</b> is sometimes used, for example
perimeterOfASquare.

Either way, start with a lowercase letter, as other things (such as class names) start uppercase.

In [39]:
# Example of assigning a function to a variable.
# In Python (but not all other languages), functions themselves are objects that can be given labels:
a = print
a("hello world")

hello world


This makes it incredibly powerful: for example, we can pass one function into another which is the core of functional programming.
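
A minimal sketch of passing one function into another is shown below. The functions apply_twice and add_three are hypothetical helpers written only for this illustration.

```python
def apply_twice(func, value):
    """Call func on value, then call func again on the result."""
    return func(func(value))

def add_three(number):
    return number + 3

result_number = apply_twice(add_three, 10)       # add_three(add_three(10))
result_text = apply_twice(str.strip, "  hi  ")   # a built-in string method works as well

print(result_number, repr(result_text))
```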

The function dir() returns the list of names, valid attributes and functions for an object (e.g., variable).
- If the object is a module object, the list contains the names of the module’s attributes.
- If the object is a type or class object, the list contains the names of its attributes, and recursively of the attributes of its bases.
- Otherwise, the list contains the object’s attributes’ names, the names of its class’s attributes, and recursively of the attributes of its class’s base classes.

dir() function Documentation: https://docs.python.org/3/library/functions.html#dir
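
The list returned by dir() can be long. One possible way to focus on the everyday methods is to skip the names that start with an underscore, as in this sketch for the str type (the variable names are made up):

```python
public_names = []
for name in dir(str):
    if not name.startswith("_"):      # skip names such as __add__ and __len__
        public_names.append(name)

print(len(public_names))
print(public_names[:5])
```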

In [40]:
# We can learn the attributes and methods available for an object, here the function "a".
# dir() returns the attributes and methods (e.g., available functions) related to an object.
# The names starting and ending with "__" are intended for internal use.
print(type(a)) # We can check what a is.
dir(a)

<class 'builtin_function_or_method'>


['__call__',
 '__class__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__name__',
 '__ne__',
 '__new__',
 '__qualname__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__self__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__text_signature__']

In [41]:
# Because a is defined as print both a and print should have the same type.
type(print)

builtin_function_or_method

In [42]:
print(x)
print(type(x)) # We can check what x is.
dir(x)

Data Science
<class 'str'>


['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isascii',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'removeprefix',
 'removesuffix',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'stri

# Operations with Variables
[Return to Table of Contents](#Table-of-Contents)

In [43]:
# you can define multiple variables at once
y, z, asd = 100, 200.124, "Maryland"
print(y)
print(z)
print(asd)

100
200.124
Maryland


In [44]:
var1 = 'This is a String'

In [45]:
var2 = "This is also a String"  # Did you see the difference between this command and the previous one?

In [46]:
# what happens if we add them?
var1+var2

'This is a StringThis is also a String'

In [47]:
# what happens if we try this:
print(var1,'.',var2)

This is a String . This is also a String


In [48]:
var1 + '. ' + var2

'This is a String. This is also a String'

In [49]:
var1, var2 = 1, 2

In [50]:
var1 / var2   # Regular division

0.5

In [51]:
var1 // var2  # Integer division: drops the fraction

0

In [52]:
y = 100
x = y**2 # exponent 
print(x)

10000


In [53]:
y = 100.0
x = y**2 # exponent
print(x)

10000.0


Do you see the difference between the outputs of the two cells above?

In [54]:
# An operation between an int and a float results in a float
1/2.0   

0.5

In [55]:
#  Python can handle complex numbers as well
w = 3+4j
print(type(w))

<class 'complex'>


In [56]:
# Float can also be scientific numbers with an "e" to indicate the power of 10.
x = 1.1e3
print(x)

1100.0


# CASTING: Specifying Data Type
[Return to Table of Contents](#Table-of-Contents)

There may be times when you want to specify the type of a variable. This can be done with casting. In short, casting forces a value to be of a specific type.

The function int() constructs an integer number from an integer literal, a float literal (by dropping the fractional part, i.e., truncating toward zero), or a string literal (provided the string represents a whole number).

In [57]:
x = int(1)   
y = int(2.8)
z = int("3") # Converts a string number to an integer. 
answer_1 = int(True)
answer_2 = int(False) # True and False are converted to 1 and 0 respectively.
print(x,y,z, answer_1, answer_2)

1 2 3 1 0
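
The difference between casting with int(), flooring, and rounding shows up most clearly with negative numbers. A short sketch with made-up variable names:

```python
import math

truncated_positive = int(2.8)         # 2: the fractional part is dropped
truncated_negative = int(-2.8)        # -2: truncation moves toward zero
floored_negative = math.floor(-2.8)   # -3: floor always moves down
rounded_value = round(2.8)            # 3: nearest integer

print(truncated_positive, truncated_negative, floored_negative, rounded_value)
```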


The function float() constructs a float number from an integer literal, a float literal, or a string literal (provided the string represents a float or an integer).

In [58]:
x = float(1)    
y = float(2.8)   
z = float("3") # Converts a string number to a float.
w = float("4.2")
answer_1 = float(True)
answer_2 = float(False) # True and False are converted to 1.0 and 0.0 respectively.
print(x,y,z,w,answer_1, answer_2)

1.0 2.8 3.0 4.2 1.0 0.0


The function str() constructs a string from a wide variety of data types, including strings, integer literals, and float literals.

In [59]:
x = str("s1") 
y = str(2)    
z = str(3.0)
w = str('Python Rocks')
print(x,y,z,w)

s1 2 3.0 Python Rocks


In [60]:
x = int('Python Rocks')
print(x)
# Note that there are some castings that will fail. Like trying to convert letters to a numeric type.
# This will give a ValueError.

ValueError: invalid literal for int() with base 10: 'Python Rocks'

When reading an error message, start with the last line, which names the error type and describes the problem, and then go to the location being highlighted. Some error messages contain many more lines, but in the majority of cases the final message and the highlighted location are enough to identify the issue.
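
When a failed cast should not stop the program, the error can be caught with a "try" statement. The sketch below is one possible pattern; to_int_or_none is a hypothetical helper written for this example, not a built-in function.

```python
def to_int_or_none(text):
    """Hypothetical helper: return int(text), or None when the text is not a whole number."""
    try:
        return int(text)
    except ValueError:
        return None

print(to_int_or_none("3"))
print(to_int_or_none("Python Rocks"))
```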

# Working With Strings
[Return to Table of Contents](#Table-of-Contents)

Note that Python stores strings as sequences of Unicode characters. Various functions and operations help us work with text strings, including print(), len(), and indexing with variable_name[].

In [61]:
print(f'Recall that variable w is defined as "{w}".')

Recall that variable w is defined as "Python Rocks".


In [62]:
w

'Python Rocks'

In [63]:
# Let's look at "w" more carefully,
# The len() function returns the number of characters within the string.
len(w)

12

In [64]:
print(w[0]) # Python uses zero-based indexing. The first position is 0, not 1.

P


In [65]:
print(w[1]) # This prints the second character.

y


In [66]:
print(w[:]) # Prints the whole string, the same as print(w).
# Recall that print() returns None.

Python Rocks


In [67]:
# In some cases we also want to select a partial substring and return it as a string object.
# The following code shows ways to select part of a string.

w[:] # Selects everything

'Python Rocks'

In [68]:
w[-1]

's'

In [69]:
w[4:]

'on Rocks'

In [70]:
w[:5]

'Pytho'

In [71]:
print(w[6]) # What is character in position 6?

 


In [72]:
w[6]

' '
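
Slices follow the general pattern [start:stop:step], where the stop position is excluded. The sketch below applies a few variations to the same text; the variable names are made up for this example.

```python
phrase = "Python Rocks"

first_word = phrase[0:6]         # positions 0 through 5; the stop position is excluded
last_word = phrase[-5:]          # the final five characters
every_other = phrase[::2]        # a step of 2 keeps every second character
reversed_phrase = phrase[::-1]   # a step of -1 walks the string backwards

print(first_word, last_word, every_other, reversed_phrase, sep=" | ")
```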

In [73]:
# There are so many great string functions in Python, 
# which will be so useful for Exploratory Data Analysis (EDA).
# For example: strip(), removes any whitespace from the beginning or the end.
Name1 = "Albert Einstein"
Name2 = " Albert Einstein"
Name3 = "  Albert Einstein   "
print(Name1.strip())
print(Name2.strip())
print(Name3.strip())

Albert Einstein
Albert Einstein
Albert Einstein


In [74]:
# The lower() method returns the string in lower case:
Name1 = "Albert Einstein"
print(Name1.lower())

albert einstein


In [75]:
# The upper() method returns the string in upper case:
Name1 = "Albert Einstein"
print(Name1.upper())

ALBERT EINSTEIN


In [76]:
Name1

'Albert Einstein'

In [77]:
Name1 = Name1.upper()

In [78]:
Name1

'ALBERT EINSTEIN'

In [79]:
# Another very useful method is replace()
# The replace() method replaces a string with another string:
prices = "$100, $200, $87, $500"
print(prices.replace("$", "€"))

€100, €200, €87, €500


In [80]:
prices = "$100, $200, $87, $500"
print(prices.replace("$", "", 2))

100, 200, $87, $500


In [81]:
# Another super powerful method is split
# split() method splits the string into substrings if it finds instances of the separator.
prices = "$100, $200, $87, $500"
print(prices.split(", "))
# Note the output here is new. It is a list of strings. We will discuss lists later on.

['$100', '$200', '$87', '$500']


In [82]:
A_sentence = "I will go shopping today.I am having dinner tomorrow"
print(A_sentence.split(" "))

['I', 'will', 'go', 'shopping', 'today.I', 'am', 'having', 'dinner', 'tomorrow']
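
In the output above, 'today.I' stays together because the text has no space after the period. The sketch below shows two possible ways to handle that, plus join(), which reverses split(). The variable names are made up for this example.

```python
sentence_text = "I will go shopping today.I am having dinner tomorrow"

sentences = sentence_text.split(".")               # split on the period instead of the space
words = sentence_text.replace(".", " ").split()    # no argument: split on any whitespace
rejoined = " ".join(words)                         # join() is the reverse of split()

print(sentences)
print(words)
print(rejoined)
```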


In [83]:
# If you try to combine a string and a number, Python will give you an error:
# If you need to combine a string and a number, then you'll need to convert numbers into strings
temperature = 74
z = "Baltimore"
print(z, 'is', str(temperature), "degrees Fahrenheit today.")

Baltimore is 74 degrees Fahrenheit today.


In [84]:
temperature + z # This gives a TypeError because a mathematical operation cannot be performed between a number and a string.
# There may be data transformations that allow such operations.

TypeError: unsupported operand type(s) for +: 'int' and 'str'

# Command-line String Input() Function
[Return to Table of Contents](#Table-of-Contents)

In [85]:
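# Python allows for command-line input. That means we are able to ask the user for input.
# What we need is the input() function.
# The prompt is passed positionally; in standard Python the built-in input() does not accept it as a keyword argument.
x = input('Enter your name and press "Enter/Return" on keyboard: ')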

Enter your name and press "Enter/Return" on keyboard:  Johnny


In [86]:
print("Hello", x)

Hello Johnny
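
The input() function always returns a string, so numeric input has to be cast before doing arithmetic. In the sketch below a fixed string stands in for the typed value so the cell runs without waiting for the keyboard; the variable names are made up.

```python
raw_age = "42"   # stands in for a typed value, e.g. raw_age = input("Enter your age: ")

age_next_year = int(raw_age) + 1   # cast to int before doing arithmetic
repeated_text = raw_age * 2        # without casting, * repeats the string

print(age_next_year)
print(repeated_text)
```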


# Arithmetic Operators
[Return to Table of Contents](#Table-of-Contents)

Python has many operators. These include those used for arithmetic operations.

In [87]:
x = 12.0
y = 2
addition = x+y
subtraction = x-y
multiplication = x*y
exponentiation = x**y
division = 12.4/y
floor_division = 12.4//y
modulus = 10%3
modulus2 = 10%3.0
print(addition, subtraction, multiplication, exponentiation, division, floor_division, modulus, modulus2)
# Note the difference between division and floor division!

14.0 10.0 24.0 144.0 6.2 6.0 1 1.0
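
Floor division and modulus behave in a way that may be surprising with negative numbers. The sketch below uses made-up variable names and also shows the built-in divmod(), which returns both results at once.

```python
quotient, remainder = divmod(10, 3)   # floor division and modulus in one call

negative_floor = -7 // 2              # floors toward negative infinity
negative_modulus = -7 % 2             # the result takes the sign of the divisor

print(quotient, remainder, negative_floor, negative_modulus)
```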


In [88]:
# Augmented Assignment Operators are very useful operators for counting, indexing, etc.

In [89]:
x += 1 # Same as: x = x + 1 
x

13.0

In [90]:
x -= 1 # same as: x = x - 1
x

12.0

In [91]:
x *= 3 # same as: x = x * 3
x

36.0

In [92]:
x /= 3 # same as: x = x / 3
x

12.0

In [93]:
x //= 3 # same as: x = x // 3
x

4.0

In [94]:
x %= 3 # same as: x = x % 3
x

1.0

# Lambda Expressions
[Return to Table of Contents](#Table-of-Contents)

Lambda is a keyword used to create a small anonymous function that is restricted to a single expression. Lambda expressions can be used with variables, lists, dataframes (to be discussed in the data analysis lecture), and other objects. Although very powerful, lambda expressions can sometimes reduce code readability. A lambda expression has the following form:

    lambda arguments: expression

Documentation Reference: https://docs.python.org/3/tutorial/controlflow.html#lambda-expressions
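
To make the form concrete, the sketch below writes the same small function once with def and once as a lambda expression, and then uses a lambda as a sort key, which is a common use. All names and values here are made up for illustration.

```python
def double_with_def(value):
    return value * 2

double_with_lambda = lambda value: value * 2   # same behavior, written as one expression

cities = ["Baltimore", "Bowie", "Annapolis"]
by_length = sorted(cities, key=lambda name: len(name))   # sort by the length of each name

print(double_with_def(5), double_with_lambda(5))
print(by_length)
```

Giving a lambda a name as above is only for comparison; the Python style guide (PEP 8) recommends a def statement when a function needs a name.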

In [95]:
x = 4
y = 3
number_list = [4, 6, 9, 10, 11, 24]

In [96]:
# Simple arithmetic operation.
(lambda x: x*2)(x = 5)

10

In [97]:
print(x) # Note x is still 4.

4


In [98]:
# Using a for loop to create a new list of squared values of a number list.
# pow Function: Applies the power of specified number.

number_list_squared = []
for element in number_list:
    number_list_squared.append(pow(element,2))
number_list_squared

[16, 36, 81, 100, 121, 576]

In [99]:
# Alternative using a lambda expression.
# map() function: applies a function to every item of an iterable and returns an iterator; list() collects the results into a list.

number_list_squared = list(map(lambda x: pow(x,2), number_list))
number_list_squared

[16, 36, 81, 100, 121, 576]

In [100]:
# Another example using filtering.
# Creates a list of the numbers that are greater than or equal to 5 after dividing by 2.
# Similarly, a for loop could be written that does the same, albeit with a little more code.
number_list_filtered = list(filter(lambda x: x/2>=5, number_list))
number_list_filtered

[10, 11, 24]
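
The same two results can also be produced without lambda by using a list comprehension, which many readers find easier to follow. This is an alternative sketch, not a replacement for the examples above; the list is repeated under a new name so the block runs on its own.

```python
sample_numbers = [4, 6, 9, 10, 11, 24]

squared = [number ** 2 for number in sample_numbers]                  # same result as the map() example
filtered = [number for number in sample_numbers if number / 2 >= 5]   # same result as the filter() example

print(squared)
print(filtered)
```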

# Conditions, Logical Operators, If statements and Loops
[Return to Table of Contents](#Table-of-Contents)

Python supports the usual logical conditions from mathematics.

## Logical Operators

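Python has various operators that allow us to compare variables, numbers, and other objects. These include, but are not limited to, the comparison operators ==, !=, <, >, <=, and >=, and the logical operators and, or, and not. The symbols &, |, and ~ are bitwise operators; some libraries also use them for element-wise AND, OR, and NOT.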

In [101]:
print(f'Recall the value of x = {x} and the value of y = {y}')

Recall the value of x = 4 and the value of y = 3


In [102]:
# For example:
x == y

False

In [103]:
x != y

True

In [104]:
1 > 4

False
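
The logical operators combine comparisons like the ones above. A short sketch with made-up variable names that hold the same values as x and y:

```python
left, right = 4, 3

both_true = left > right and right > 0      # and: True only when both sides are True
either_true = left < right or right == 3    # or: True when at least one side is True
negated = not left == right                 # not: flips the Boolean value
chained = 1 < right <= left                 # comparisons can be chained
bitwise_and = 6 & 3                         # & works on the bits of integers: 110 & 011 = 010

print(both_true, either_true, negated, chained, bitwise_and)
```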

## Basics of a Script: Indentation
Python uses indents to indicate blocks of code – no brackets. This section discusses indentation on various control flow and compound statements such as "If" Statements, loops (e.g., "for" loops, "while" loops). Other compound statements include "match" statements, "with" statements and "try" statements. These statements, especially the loops are the backbone of automation.

Documentation References:
- https://docs.python.org/3/reference/compound_stmts.html
- https://docs.python.org/3/tutorial/controlflow.html

## If Statements
If statements are used for handling decisions in comparisons, defined conditions or actions.

Documentation References:
- https://docs.python.org/3/tutorial/controlflow.html#if-statements

In [105]:
if x > y:
    print("x is greater than y.")
# If the condition is true, the statement is printed.

x is greater than y.


In [106]:
x = 4
y = 26

In [107]:
if x > y: # With the values of x and y set above (4 and 26), this condition is not met.
    print("x is greater than y.")
    if x == 4: # Only enters this statement if the previous if condition is met and this condition is met.
        print('Yes, x equals 4') # Only prints if x > y and x == 4.
    print('True x is greater than y') # Prints whenever x > y, regardless of whether x == 4.
    
print('Yes it worked.') # This print function is outside of the two if statements. It always prints.

Yes it worked.


In [108]:
if 4 < 6:
    print("The statement is true.")

The statement is true.


In [109]:
# Note: in previous versions of Python, indentation errors were highlighted in red but the code would still run.
# In the latest release at the beginning of the Fall 2024 semester, it gives an IndentationError.
if 4 < 6:
   print("The statement is true.")
    print("Completed.")

IndentationError: unexpected indent (2454504753.py, line 5)

In [110]:
# In the previous examples, if the condition is not met, execution simply continues with the lines below.
# The if/else statement below prints one result if the condition is met; otherwise it prints another.
if x > y:
    print("x is greater than y.")
else:
    print("x is NOT greater than y.")

x is NOT greater than y.
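
When more than two outcomes are possible, an elif branch can be placed between if and else. The sketch below wraps the comparison in a small function named compare, which is a hypothetical helper introduced only for this illustration.

```python
def compare(x, y):
    """Return a sentence describing how x relates to y (hypothetical helper)."""
    if x > y:
        return "x is greater than y."
    elif x == y:
        return "x is equal to y."
    else:
        return "x is less than y."

print(compare(4, 26))
print(compare(4, 4))
print(compare(26, 4))
```

Only the first branch whose condition is true runs; the remaining branches are skipped.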


## Loops
There are two main types of loops:
- For loops or for statements: iterate through the items of an iterable (e.g., a list or a range).
- While loops or while statements: iterate for as long as a certain condition remains true.

When defining loops we tend to use i, j, and k for counters. The convention traces back to FORTRAN, the first third-generation language, in which variables whose names began with the letters I through N were implicitly typed as integers (whole numbers). For this reason, people still use i, j, k for counting things (but not lowercase "L", which looks too much like a one).

Documentation References:
- https://docs.python.org/3/tutorial/controlflow.html#for-statements
- https://docs.python.org/3/reference/compound_stmts.html#the-for-statement
- https://docs.python.org/3/reference/compound_stmts.html#the-while-statement

In [111]:
# Indentation example
x = 1
y = 2
for i in (1, 2, 3):
    x = x + i
    y = y + i
    if x > y:
        print('Something is wrong!')
    print(x, y)
print('All done.')

2 3
4 5
7 8
All done.
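
The for loop above runs once per item. A while loop, by contrast, keeps running for as long as its condition is true. The following is a minimal sketch with illustrative variable names: it doubles a value until the value reaches at least 100 and counts how many doublings were needed.

```python
value = 1
steps = 0
while value < 100:       # Checked before every pass through the loop.
    value = value * 2
    steps = steps + 1
    print(steps, value)
print('All done.')
```

The loop body must eventually make the condition false (here, value keeps growing); otherwise the loop never ends.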


# Modules, Packages, and Libraries
[Return to Table of Contents](#Table-of-Contents)

Python has various modules, packages and libraries (terms often used interchangeably) that add further capabilities to the base Python functions. They are typically organized by topic. Examples of libraries for math and statistics include numpy and stats, libraries for data analysis include Pandas, libraries for data visualization include matplotlib and seaborn, machine learning libraries include scikit-learn, and so on. In many cases libraries overlap in their capabilities.

The Anaconda Distribution already includes many Python modules, packages and libraries, specifically those related to data science. The full list can be found at https://docs.anaconda.com/anaconda/packages/pkg-docs/. Note that it varies by version of the Anaconda Distribution.

Modules are most often imported at the beginning of a Jupyter Notebook. We will talk about this more later.

## Modules: Importing
[Return to Table of Contents](#Table-of-Contents)

In [112]:
# Can also import specific functions from a library.
from numpy import zeros # This imports the zeros function only.
# Numpy Documentation: https://numpy.org/devdocs/user/whatisnumpy.html

In [113]:
zeros(5)

array([0., 0., 0., 0., 0.])

In [114]:
# To import and load a package you do the following.
import numpy as np # Note that many packages are imported under a conventional abbreviation, in this case np.
# The abbreviation saves time when calling the library.

In [115]:
np.zeros(5)

array([0., 0., 0., 0., 0.])

In [116]:
# Some places may show the following when importing a module:
#from numpy import * 
# This imports all public names from a module, which can be dangerous because they can overwrite names already in the namespace.
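
To illustrate the namespace problem, the sketch below uses the standard math module instead of numpy (an assumption made so the example needs no extra packages). The code runs inside a separate dictionary, here called namespace, through exec, so the names in this notebook stay untouched.

```python
import math

namespace = {}  # A scratch namespace, separate from the notebook's own names.

# Before the star import, pow refers to the built-in function.
exec("result_before = pow(2, 3)", namespace)

# The star import silently replaces pow with math.pow.
exec("from math import *", namespace)
exec("result_after = pow(2, 3)", namespace)

print(namespace["result_before"])   # Built-in pow with integers returns an integer.
print(namespace["result_after"])    # math.pow always returns a float.
```

Nothing warns that pow changed meaning, which is why importing specific names or using an abbreviation such as np is usually safer.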

In [117]:
print(np.__version__) # This prints the version of the np library.

2.1.3


In [118]:
# Another example
# https://docs.python.org/3/library/random.html
import random
#from random import randint

In [119]:
# Every time the cell runs, it creates a random integer between the given integers (both endpoints included).
random.randint(4, 10)

9

"Randomness" of a function can still be controlled with a random seed or random state. The "random seed" parameter may be defined at the library level, which is the case for the "random" library, or in other cases may be defined as a parameter of the function.

In [120]:
# If we wanted the randint function to be repeatable we could use the "random seed".
# In other random functions, there may be a parameter called "random state".

# Even though the result is then not truly random, it allows for reproducibility.
random.seed(a=8, version=2)
random.randint(4, 10)
# If in a loop, could set the parameter a=i, which changes the seed on each iteration but still allows reproducibility.

5
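
A short way to check that seeding makes results repeatable is to generate the same sequence twice. This is a minimal sketch; the variable names are illustrative, and the actual numbers drawn are not shown because they depend on the generator.

```python
import random

random.seed(8)
first_run = [random.randint(4, 10) for _ in range(5)]

random.seed(8)  # Resetting the same seed restarts the same sequence.
second_run = [random.randint(4, 10) for _ in range(5)]

print(first_run == second_run)

# A separate generator object keeps its own seed and leaves the module-level state alone.
generator = random.Random(8)
generator_values = [generator.randint(4, 10) for _ in range(5)]
print(generator_values)
```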

In [121]:
# Creating a list of random integers:
random_integers_list = []
for i in range(10):
    random_integers_list.append(random.randint(-10,10))

random_integers_list

[1, 2, -6, -4, -9, -8, -6, -3, 6, -4]

We can select elements from this list in various ways.

In [122]:
# Selecting element 2. Note that element 0 is the first element.
random_integers_list[2]

-6

In [123]:
# Selecting the last element. Negative indices count from the end of the list.
random_integers_list[-1]

-4

Starting at element 2, select every third element.

In [124]:
filtered_random_integers_list = []
for i in range(len(random_integers_list)):
    if i % 3 == 2: # True for i = 2, 5, 8, ...
        filtered_random_integers_list.append(random_integers_list[i]) # Creates a list with the filtered numbers.
filtered_random_integers_list # Show the final list.

[-6, -8, 6]

In [125]:
# Same result using slicing: start at index 2 and take every third element.
random_integers_list[2::3]

[-6, -8, 6]
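
Slicing follows the general pattern list[start:stop:step], where any of the three parts can be left out. The sketch below reuses the values printed above in a list named values (an illustrative name) to show a few common cases.

```python
values = [1, 2, -6, -4, -9, -8, -6, -3, 6, -4]

first_three = values[:3]       # From the start up to, but not including, index 3.
last_three = values[-3:]       # The final three elements.
every_other = values[::2]      # Every second element, starting at index 0.
reversed_values = values[::-1] # A negative step walks the list backwards.

print(first_three)
print(last_three)
print(every_other)
print(reversed_values)
```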

## Modules: Installing
[Return to Table of Contents](#Table-of-Contents)

Recall that you should have used Anaconda distribution to install Python and its [included data science packages or libraries](https://docs.anaconda.com/free/anaconda/reference/packages/pkg-docs/).

Installing additional packages is generally easy using conda on the command line. To open the command line, search for the application "Anaconda Prompt" (a command line, also called a terminal).

One method to install packages is to use conda, which installs from the Anaconda package repository, with the following line:
- "conda install name_of_package"

Another method is to use Python's pip package manager. A package can be installed with pip from a Jupyter Notebook code cell using the following line of code:
- "!pip install name_of_package"

pip can also be used in the Anaconda Prompt command line. Conda accesses the package repository at Anaconda, while pip accesses the Python Package Index (PyPI) at https://pypi.org.

<b>Example</b>

Let's use the "names" library as an example. To install it, uncomment and run the cell below containing "!pip install names". The names module documentation is at: https://pypi.org/project/names/. After installing, comment out the line again; there is no need to rerun "!pip install names". Note that an asterisk (*) next to the cell means that the cell is running.

In [126]:
# To install the "names" library, uncomment the line below and run this cell.
#!pip install names

# After installing, comment out the line above.

In [127]:
import names # After installation, the only thing that needs to be done is importing the module.

ModuleNotFoundError: No module named 'names'

If a warning appears saying that names.exe is installed in 'DIRECTORY', which is not on PATH, follow the instructions to add the path to the environment variables:
-  https://stackoverflow.com/questions/69547919/f2py-exe-is-somewhere-but-the-directory-isnt-on-path

In [128]:
print(names.__version__) # After importing, I can check the version of a module.
# There are other ways to check the version.

NameError: name 'names' is not defined

In [129]:
# After importing I can use functions from the names library.
names.get_full_name()

NameError: name 'names' is not defined
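
The errors above appear because the "names" library was not installed when the cells ran. One way to avoid them is to check whether a module can be found before importing it. The sketch below uses importlib.util.find_spec from the standard library; is_installed is a hypothetical helper name chosen for this illustration.

```python
import importlib.util

def is_installed(module_name):
    """Return True if the module can be found for import (hypothetical helper)."""
    return importlib.util.find_spec(module_name) is not None

print(is_installed("random"))  # Part of the standard library, so always available.

if is_installed("names"):
    import names
    print(names.get_full_name())
else:
    print('The "names" library is not installed; run "pip install names" first.')
```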

# Always remember to save your notebook!
In this class we will discuss how to use many libraries and their functions for each of the workflows in data science.

[Return to Table of Contents](#Table-of-Contents)
# NOTEBOOK END