# Python Lexical Analysis

Lexical analysis is the process the Python interpreter uses to break a line of code into its component parts (called tokens) so that it can understand it. A single line of Python can be made up of several components.

https://docs.python.org/3/reference/lexical_analysis.html


Everything I will teach you here is true whether the line of code is in a cell in Colab, in a Python program, or typed into a Python interpreter.

In [112]:
x = "thonis" in ("python" + "isfun")   #Example line of code
x

True

A line of Python code can span more than one physical line.  If a line ends in a "\" then it is explicitly line wrapped.  If it wraps inside parentheses, square brackets, or curly braces (for example in the middle of a data structure), Python also allows implicit line wrapping.
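As a minimal sketch of both kinds of wrapping (the variable names `total`, `numbers` and `wrapped` are made up for illustration):

```python
# Explicit wrapping: the backslash must be the very last character on the line.
total = 1 + 2 + \
        3 + 4

# Implicit wrapping: anything inside (), [] or {} may span several lines.
numbers = [
    1, 2,
    3, 4,
]
wrapped = ("python" +
           "isfun")

print(total, numbers, wrapped)
```

Implicit wrapping is usually the easier one to read and maintain, because a stray space after a backslash breaks explicit wrapping.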

## Different parts of a line of code

 - variables - Labels for a location in memory that stores a value ("x" above)
 - literals - Fixed values in code ("thonis", "python", "isfun" above)
 - keywords - Commands that Python understands ("in" above)
 - operators - Used for calculations such as math, comparison, assignment ("=", "+" above)
 - delimiters - Used to group together parts of code ("(" and ")" above)
 - comments - Any text after a # (octothorpe) is ignored by the interpreter and used to document your code.
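If you want to see how Python itself splits the example line into parts, the standard library `tokenize` module can do it. This is only a sketch for exploration. Note that the tokenizer uses coarser categories than the list above: it reports keywords such as `in` as `NAME` tokens and reports delimiters as `OP` tokens.

```python
import io
import tokenize

line = 'x = "thonis" in ("python" + "isfun")   #Example line of code\n'

# generate_tokens() takes a readline-style callable and yields TokenInfo tuples.
tokens = list(tokenize.generate_tokens(io.StringIO(line).readline))

for tok in tokens:
    # tok_name maps the numeric token type to a readable name such as NAME or OP.
    print(f"{tokenize.tok_name[tok.type]:10} {tok.string!r}")
```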


## Variables

 - Variables are labels for locations in memory that store objects.
 - All variable names are case sensitive
 - All variable names must start with _ or a letter (A-Z or a-z; Python 3 also accepts most Unicode letters). Digits are allowed after the first character
 - A variable on the left side of the = is an assignment (writing to the variable)
 - A variable on the right side of the = is reading the variable
 - We also assign values to variables when we call a function, but we will save that discussion for later

In [6]:
a_number_between_1_and_10 = 5
the_age_of_something = 500
print(a_number_between_1_and_10)

5


In [7]:
a = 5
_123 = 10
x = _123 + a
print(x)

15
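The naming rules above can be checked from code. The sketch below uses `str.isidentifier()` and `keyword.iskeyword()`. The candidate names are made up for illustration.

```python
import keyword

candidates = ["_123", "age", "Age", "2fast", "my-var", "class"]

for name in candidates:
    # str.isidentifier() checks the spelling rules only; keywords such as
    # "class" pass that check but still cannot be used as variable names.
    usable = name.isidentifier() and not keyword.iskeyword(name)
    print(f"{name!r:10} usable as a variable name: {usable}")

# Names are case sensitive, so these are two separate variables.
age = 5
Age = 50
print(age, Age)
```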


Python predefines about 150 names (built-in functions, exceptions, and constants) when the interpreter starts. The exact number varies from version to version.

In [8]:
dir(__builtins__)

['ArithmeticError',
 'AssertionError',
 'AttributeError',
 'BaseException',
 'BlockingIOError',
 'BrokenPipeError',
 'BufferError',
 'ChildProcessError',
 'ConnectionAbortedError',
 'ConnectionError',
 'ConnectionRefusedError',
 'ConnectionResetError',
 'EOFError',
 'Ellipsis',
 'EnvironmentError',
 'Exception',
 'False',
 'FileExistsError',
 'FileNotFoundError',
 'FloatingPointError',
 'GeneratorExit',
 'IOError',
 'ImportError',
 'IndentationError',
 'IndexError',
 'InterruptedError',
 'IsADirectoryError',
 'KeyError',
 'KeyboardInterrupt',
 'LookupError',
 'MemoryError',
 'ModuleNotFoundError',
 'NameError',
 'None',
 'NotADirectoryError',
 'NotImplemented',
 'NotImplementedError',
 'OSError',
 'OverflowError',
 'PermissionError',
 'ProcessLookupError',
 'RecursionError',
 'ReferenceError',
 'RuntimeError',
 'StopAsyncIteration',
 'StopIteration',
 'SyntaxError',
 'SystemError',
 'SystemExit',
 'TabError',
 'TimeoutError',
 'True',
 'TypeError',
 'UnboundLocalError',
 'UnicodeDecode
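
To count the predefined names instead of reading the whole list, one option is to import the `builtins` module, which exposes the same names. This is a small sketch, and the count it prints depends on your Python version.

```python
import builtins

# Importing the builtins module is a portable way to reach the same names.
names = dir(builtins)

print(len(names))                          # the count differs between versions
print("print" in names, "len" in names)    # familiar functions are in the list
```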

Variables point to a location in memory.  The objects stored there often come from literal values.

https://pythontutor.com/visualize.html#code=a%20%3D%2010%20%0Ab%20%3D%205%0Ac%20%3D%20900%0A%0Aa%20%3D%20b%20%2B%20c%20&cumulative=False&curInstr=0&heapPrimitives=True&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=False
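
The same idea can be sketched without the visualizer by using the built-in `id()` function, which returns an integer that identifies the object a variable currently labels. The names `before`, `after` and `alias` are illustrative only.

```python
a = 10
b = 5
c = 900
before = id(a)   # identifies the object that a labels right now

a = b + c        # a now labels a different object, the result of 5 + 900
after = id(a)

alias = a        # a second label for the same object, not a copy

print(a)
print(before != after)   # the label moved to a new object
print(alias is a)        # both names label one object
```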


## Literals

Literals are values written directly in your program. There are many different types of literal values, for example:

 - Integers :  9999, -1, 0
 - Floats:   3.14, 9.99, -0.01
 - Strings:  'Hello', "Bye\n", f"5 times 2 is {5*2}"
 - Booleans:   True, False
 - Bytes:  b"\x41\x42\x43", b"ABC"
 - Lists:  [1,2,3]
 - Dictionaries: {"record1":"value1", "record2":"value2"}
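
A quick way to confirm what kind of object each literal creates is to ask `type()`. This is a minimal sketch and the `samples` list is made up for illustration.

```python
samples = [
    9999,
    3.14,
    "Hello",
    f"5 times 2 is {5*2}",
    True,
    b"\x41\x42\x43",
    [1, 2, 3],
    {"record1": "value1"},
]

for value in samples:
    # type(value).__name__ is the short name of the object's type.
    print(f"{type(value).__name__:6} {value!r}")
```

The bytes literal prints as `b'ABC'` because `\x41`, `\x42` and `\x43` are the hexadecimal codes for A, B and C.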

In [16]:
x = 10 <= 5
x = True
y = False
x == y

False

In [13]:
a = 'Mark'
x = """hello"""
x = f"Hello {a}"
print(x)

Hello Mark


In [10]:
x = "hello\tworld\tpython"
print(x)

hello	world	python
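
The `\t` in the string above is an escape sequence that stands for a single tab character. As a small sketch, `repr()` shows the escape as it was typed, and a raw string (the `r` prefix) turns escape processing off:

```python
x = "hello\tworld"
raw = r"hello\tworld"   # the r prefix turns off escape processing

print(x)         # the tab is rendered as whitespace
print(repr(x))   # repr() shows the escape sequence as it was typed
print(raw)       # backslash and t are printed as two ordinary characters
print(len(x), len(raw))
```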


## Keywords 

Keywords are the commands that Python understands.  There are only about 35 of them, although the number varies from version to version.

Don't confuse keywords with built-in functions such as print() or len().  Those are variables that store code, and you execute that code by putting parentheses after the name.

Keywords that accept arguments are normally followed by a space, not by parentheses.  Examples:

```
"import module"
"from os import object"
"del variable"
"5 is 10"
"'i' in 'team'"
```

In [18]:
import keyword
keyword.kwlist

['False',
 'None',
 'True',
 '__peg_parser__',
 'and',
 'as',
 'assert',
 'async',
 'await',
 'break',
 'class',
 'continue',
 'def',
 'del',
 'elif',
 'else',
 'except',
 'finally',
 'for',
 'from',
 'global',
 'if',
 'import',
 'in',
 'is',
 'lambda',
 'nonlocal',
 'not',
 'or',
 'pass',
 'raise',
 'return',
 'try',
 'while',
 'with',
 'yield']
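
To tell a keyword from a built-in function in code, one possible check is `keyword.iskeyword()`. This is a minimal sketch, and the length it prints depends on your Python version.

```python
import builtins
import keyword

print(len(keyword.kwlist))          # varies between Python versions
print(keyword.iskeyword("in"))      # True: part of the language grammar
print(keyword.iskeyword("print"))   # False: print is an ordinary built-in name
print(hasattr(builtins, "print"))   # True: it lives in the builtins module
```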

## Operators
Operators perform operations on operands.  

### Mathematical operators +, -, *, /, //, **, and %

In [29]:
10 ** 3

1000
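
The division-related operators are the ones that most often surprise people, so here is a short sketch of how they differ. The variable names are illustrative.

```python
true_div = 7 / 2       # true division always returns a float
floor_div = 7 // 2     # floor division rounds down
neg_floor = -7 // 2    # rounding down means toward negative infinity
remainder = 7 % 2      # remainder after floor division
power = 2 ** 10        # exponentiation

print(true_div, floor_div, neg_floor, remainder, power)
# 3.5 3 -4 1 1024
```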

### Assignment operators =, +=, -=, *=, /=, //=, **=, %=
```
x = 10
x += 5   # same as x = x + 5
x -= 100  # same as x = x - 100
x *= 60   # same as x = x * 60
```

In [31]:
x = 10
x

10

In [35]:
x *= 2
x

20

### Comparison operators 

Comparison operators compare two objects (often literals) and produce a Boolean value of either True or False.

```
10 < 5
10 > 6
10 >= 9
10 <= 9
10 == 9
10 != 9
```

In [44]:
"a" < "B"

False
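
The result above is False because strings are compared character by character using each character's Unicode code point, and every uppercase ASCII letter has a lower code point than every lowercase one. A short sketch using the built-in `ord()` function:

```python
code_a = ord("a")
code_B = ord("B")
print(code_a, code_B)                  # 97 66

print("a" < "B")                       # False, because 97 is not less than 66
print("a".lower() < "B".lower())       # True once both sides are lowercase

ordered = sorted(["b", "A", "a", "B"])
print(ordered)                         # ['A', 'B', 'a', 'b']
```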

#### Logical Tests

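Logical tests combine Boolean values using "and", "or", "not", and "^". Strictly speaking "^" is the bitwise XOR operator, but it behaves as a logical XOR when both operands are Booleans.

#### The AND table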

| X     | Y     | X AND Y |
|-------|-------|---------|
| True  | True  | True    |
| True  | False | False   |
| False | True  | False   |
| False | False | False   |

In [52]:
x = 10 < 5
y = 10 > 11
y and x

False

#### The OR table

| X     | Y     | X OR Y |
|-------|-------|--------|
| True  | True  | True   |
| True  | False | True   |
| False | True  | True   |
| False | False | False  |

In [56]:
False or True

True


#### The XOR table

X XOR Y is True only when the same row is False in the AND table and True in the OR table, that is, when exactly one of X and Y is True.

| X     | Y     | X XOR Y |
|-------|-------|---------|
| True  | True  | False   |
| True  | False | True    |
| False | True  | True    |
| False | False | False   |

In [58]:
False ^ True

True

#### The NOT table

| X     | NOT X |
|-------|-------|
| True  | False |
| False | True  |

In [62]:
not 10 == 10

False
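
The four tables above can be reproduced from code. This is a minimal sketch that uses `itertools.product` to generate every combination of True and False. The `rows` list is illustrative.

```python
from itertools import product

rows = []
for x, y in product([True, False], repeat=2):
    # Each row holds X, Y, X AND Y, X OR Y, X XOR Y.
    rows.append((x, y, x and y, x or y, x ^ y))

print("X      Y      AND    OR     XOR")
for row in rows:
    print(" ".join(f"{str(value):6}" for value in row))
```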

### Bitwise Operators   &, |, ^, ~, >>, <<

```
10 >> 6
10 & 6
10 | 6
```

In [79]:
format(0b1100 ^ 0b1010, "04b")

'0110'
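
Printing results in binary makes the bitwise operators easier to follow. The sketch below reuses the two values from the cell above. The names `a`, `b` and `results` are illustrative.

```python
a, b = 0b1100, 0b1010

results = {
    "a & b": a & b,     # bits set in both
    "a | b": a | b,     # bits set in either
    "a ^ b": a ^ b,     # bits set in exactly one
    "a << 2": a << 2,   # shift left two places, same as multiplying by 4
    "a >> 2": a >> 2,   # shift right two places, same as floor dividing by 4
}
for label, value in results.items():
    print(f"{label:7} = {value:06b}")

inverted = ~a           # for Python integers, ~x is defined as -(x + 1)
print(inverted)
```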

### Object instance operators
```
is
is not
```

In [81]:
x = 5
y = 5
x is y

True
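
The result above is True because CPython reuses a single object for each small integer, which is an implementation detail and not something to rely on. The difference between `is` (same object) and `==` (equal value) is easier to see with lists, as in this sketch with illustrative names:

```python
first = [1, 2, 3]
second = [1, 2, 3]   # equal contents, but a separate list object
alias = first        # another name for the same object

print(first == second)   # True: the values are equal
print(first is second)   # False: two distinct objects
print(first is alias)    # True: both names label one object
```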

### Object Membership
```
in
not in
```

In [84]:
"A" not in "this a test"

True
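
The result above is True because membership tests on strings are case sensitive: the text contains "a" but not "A". A short sketch of how `in` behaves for strings, lists and dictionaries (the `ports` dictionary is made up for illustration):

```python
text = "this a test"
ports = {"ssh": 22, "http": 80}

lower_found = "a" in text        # True: substring match
upper_found = "A" in text        # False: matching is case sensitive
in_list = 3 in [1, 2, 3]         # True: list membership compares elements
key_found = "ssh" in ports       # True: dictionary membership checks keys
value_found = 22 in ports        # False: values are not searched

print(lower_found, upper_found, in_list, key_found, value_found)
```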

The order in which operators are processed is called "operator precedence".

https://docs.python.org/3/reference/expressions.html#operator-precedence

At the top of the order you will see parentheses, square brackets, and other groupings.  These group together parts of the expression and are called "delimiters".

In [85]:
(10 > 5) and (11 < (10 + 2) )

True

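PEMDAS = Parentheses, Exponents, Multiplication, Division, Addition, Subtraction. Multiplication and division share one precedence level and are evaluated left to right, as do addition and subtraction.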
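
A few expressions where precedence changes the answer, as a minimal sketch:

```python
print(2 + 3 * 4)          # 14: multiplication binds tighter than addition
print((2 + 3) * 4)        # 20: parentheses override the default order
print(-2 ** 2)            # -4: exponentiation binds tighter than unary minus
print(2 ** 3 ** 2)        # 512: ** groups right to left, so 2 ** (3 ** 2)
print(10 - 4 - 3)         # 3: subtraction groups left to right
print(not True or True)   # True: parsed as (not True) or True
```

When in doubt, adding parentheses costs nothing and makes the intended order obvious to the next reader.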

## Delimiters

Delimiters are used to group together or extract parts of Python code or expressions.  They include `[]`, `{}`, `()`, `,`, `:`, `'`, `"`, `;`, `\` and `.`


Parentheses can be used to enforce order of operations
```
x = (100 + 10)
x = ( 1 + (100 + 20))
```  
Consider `3 * 3 + 2` vs `3 * (3 + 2)`

In [90]:
(3 + 3) * 2

12




Parentheses can be used to execute code that is stored in a variable
```
print()
```

Parentheses with commas can be used to pass arguments to code stored in a variable
```
print("hello", "world")
```

In [97]:
print( "hello", 3 + 7, 10/2)

hello 10 5.0
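
To make "code stored in a variable" concrete, here is a small sketch. The names `shout` and `measure` are made up for illustration. Without parentheses you are only referring to the stored code. With parentheses you execute it.

```python
shout = str.upper        # no parentheses: just another label for the same code
print(shout)             # shows the function object, nothing is executed
print(shout("hello"))    # parentheses execute it, passing one argument

measure = len            # a second name for the built-in len
print(measure([1, 2, 3]), measure is len)
```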


Semicolons can separate multiple commands on a line

`x = 5; x = x + 100; print(x)`

In [99]:
x = 5 ; x = x + 10 ; print(x)

15


A period can be used to access attributes on objects.

```
import sys
sys.version
sys.flags
sys.getsizeof(x)
"test".upper()
```

A colon is used to start a block of code. The indented lines that follow it are grouped together:

```
def print_banner():
    print("WARNING:")
    print("You have connected to my system. If you are not authorized I will prosecute.")
    print("Go away now before it is too late.")
```

In [100]:
def print_banner():
    print("WARNING:")
    print("You have connected to my system. If you are not authorized I will prosecute.")
    print("Go away now before it is too late.")

In [102]:
print_banner()

WARNING:
You have connected to my system. If you are not authorized I will prosecute.
Go away now before it is too late.


Square brackets and commas can be used to separate and group together literals to form new data

```
x = [1,2,3,4,5]
``` 

In [103]:
x = [1,2,3]

In [109]:
x[2]


3
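
Square brackets after a variable extract part of the data. The sketch below shows the common forms on an illustrative list. Indexes start at zero, which is why `x[2]` above returned the third item.

```python
x = [1, 2, 3, 4, 5]

first_item = x[0]     # indexes start at zero
third_item = x[2]
last_item = x[-1]     # negative indexes count from the end
middle = x[1:3]       # the colon makes a slice, end index excluded

print(first_item, third_item, last_item, middle)
# 1 3 5 [2, 3]
```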