# Performance 1

Recommendation: put all `import` statements in a single cell at the top of the notebook, or at the top of your script file (`.py`).

### Two styles of import

1. `from <module> import <some_function, some_variable>`
    - invocation `some_function()`
2. `import <module>`
    - invocation `<module>.some_function()`

In [1]:
# import statements

# TODO: use from style of import for importing "check_output" from subprocess
from subprocess import check_output

# TODO: use import style of import for importing "time" module
import time

### How to open documentation about a function inside `jupyter`?
Press `Shift + Tab` after typing the function name.

In [2]:
# TODO: open documentation for check_output
check_output

<function subprocess.check_output(*popenargs, timeout=None, **kwargs)>

### What does `check_output` do?

It runs a command, with or without arguments, and returns the output of the command.
- Argument: command to run
- Return value: output of the command as a `bytes` object.

In [3]:
# TODO: invoke check_output to execute "pwd"
pwd_output = check_output("pwd")
pwd_output

b'/Users/msyamkumar/Desktop/Teaching/Spring23_CS320/cs320-s23-lectureMaterials/lec/04-perf1\n'

In [4]:
# TODO: use type function call to check the output type of check_output
type(pwd_output)

bytes

### What is a `bytes` object?

- `bytes` is an example of a sequence.

- Recall that `list`, `str`, `tuple` are examples of Python sequences.
- Key sequence features:
    - indexing `seq[index]`
    - slicing `seq[start index:exclusive end index]`
    - iteration `for val in seq:`
    - length `len(seq)`
    - existence / constituency match `<val> in seq`
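
As a minimal sketch, the sequence operations listed above can be tried on a small `bytes` literal. The value `b"/usr/bin\n"` is made up for illustration. Note that indexing gives an `int`, while slicing gives another `bytes` object.

```python
# Hypothetical sample value, standing in for the output of a command.
data = b"/usr/bin\n"

first = data[0]             # indexing returns an int (the byte value)
prefix = data[0:4]          # slicing returns a new bytes object
values = [b for b in data]  # iteration yields ints, one per byte
size = len(data)            # length via the built-in len()
has_slash = b"/" in data    # membership test with a bytes operand

print(first, prefix, size, has_slash)
```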

In [5]:
# TODO: use indexing to extract value at index 0
pwd_output[0]

47

### `bytes` conversion to `str`
- requires knowing the encoding
- `str(<bytes_variable>, <encoding>)`
- Most programs on Linux use `utf-8` encoding

In [6]:
# Can we just convert bytes directly into str?
# Not really, you need to specify the encoding; otherwise you get the repr of the bytes object
str(pwd_output)

"b'/Users/msyamkumar/Desktop/Teaching/Spring23_CS320/cs320-s23-lectureMaterials/lec/04-perf1\\n'"

In [7]:
# TODO: let's try utf-8 encoding
pwd_output_str = str(pwd_output, "utf-8")
pwd_output_str

'/Users/msyamkumar/Desktop/Teaching/Spring23_CS320/cs320-s23-lectureMaterials/lec/04-perf1\n'

Recall that `print` renders a `str` for display: an escape sequence such as `\n` appears as an actual line break rather than as a backslash escape.

In [8]:
print(pwd_output_str)

/Users/msyamkumar/Desktop/Teaching/Spring23_CS320/cs320-s23-lectureMaterials/lec/04-perf1



In [9]:
# You must use the correct encoding; otherwise the conversion may fail or, as here, silently produce garbage
str(pwd_output, "cp273")

'\x07íËÁÊË\x07_Ë`/_,Í_/Ê\x07àÁË,È?ø\x07èÁ/[ÇÑ>Å\x07ëøÊÑ>Å\x16\x93^{ë\x93\x16\x90\x07[Ë\x93\x16\x90\x05Ë\x16\x93\x05%Á[ÈÍÊÁ(/ÈÁÊÑ/%Ë\x07%Á[\x07\x90\x94\x05øÁÊÃ\x91\x8e'
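
The `cp273` call above did not raise an error, because that encoding maps every byte to some character. Other mismatches do fail outright. A small sketch, using a made-up byte string that is valid Latin-1 but not valid UTF-8:

```python
# Hypothetical bytes: "caf" followed by 0xE9, which is "é" in Latin-1
# but an incomplete multi-byte sequence in UTF-8.
raw = b"caf\xe9"

try:
    text = str(raw, "utf-8")
except UnicodeDecodeError as err:
    text = None
    print("decoding failed:", err.reason)

# Decoding with the encoding that was actually used works.
decoded = str(raw, "latin-1")
print(decoded)
```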

### `str` methods recap

- `<str_variable>.strip()`: removes leading and trailing whitespace
- `<str_variable>.split(<separator>)`: returns list of strings split by separator

In [10]:
# TODO: try strip method
pwd_output_str.strip()

'/Users/msyamkumar/Desktop/Teaching/Spring23_CS320/cs320-s23-lectureMaterials/lec/04-perf1'

In [11]:
# TODO: try split method using "/" as separator
pwd_output_str.split("/")

['',
 'Users',
 'msyamkumar',
 'Desktop',
 'Teaching',
 'Spring23_CS320',
 'cs320-s23-lectureMaterials',
 'lec',
 '04-perf1\n']

In [12]:
# You can chain methods or function calls together
# TODO: first strip and then split the string
pwd_output_str.strip().split("/")

['',
 'Users',
 'msyamkumar',
 'Desktop',
 'Teaching',
 'Spring23_CS320',
 'cs320-s23-lectureMaterials',
 'lec',
 '04-perf1']

### What does `check_output` do when the command doesn't exist?
- It raises a `FileNotFoundError`.

In [13]:
# TODO: invoke check_output by passing "hahaha" as argument
check_output("hahaha")

FileNotFoundError: [Errno 2] No such file or directory: 'hahaha'
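
If a missing command should not stop the program, the exception can be caught. The sketch below uses a hypothetical helper named `run_or_none`. It also catches `CalledProcessError`, which `check_output` raises when the command runs but exits with a non-zero status. `sys.executable` is used so that the example does not assume a particular `python3` on the path.

```python
import sys
from subprocess import check_output, CalledProcessError

def run_or_none(command):
    """Return the command's output as bytes, or None if it cannot be run."""
    try:
        return check_output(command)
    except FileNotFoundError:
        # The program itself was not found.
        return None
    except CalledProcessError:
        # The program ran but exited with a non-zero status.
        return None

print(run_or_none("hahaha"))  # assumed not to exist on this machine
print(run_or_none([sys.executable, "-c", "print(4 + 5)"]))
```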

### How can we use `check_output` to execute a command with arguments?

- option 1: pass the command and its arguments as a single string, and pass `True` as the argument to the parameter `shell`
- option 2: pass a list of strings; for example: `[<command>, <arg1>, <arg2>]`

### git --version

In [14]:
# TODO: use option 1 to run "git --version"
check_output("git --version", shell=True)

b'git version 2.32.1 (Apple Git-133)\n'

What would happen if we switch the order of the two arguments? Recall that positional arguments should come before keyword arguments.

In [15]:
check_output(shell=True, "git --version")

SyntaxError: positional argument follows keyword argument (2083628834.py, line 1)

In [16]:
# TODO: use option 2 to run "git --version"
check_output(["git", "--version"])

b'git version 2.32.1 (Apple Git-133)\n'

In [17]:
# TODO: combine check_output with str typecast
git_version_str = str(check_output(["git", "--version"]), "utf-8")

In [18]:
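# TODO: write code to extract just the version number
# The output looks like "git version 2.32.1 (Apple Git-133)\n"
print(git_version_str.strip().split(" ")[2]) # option 1: third word
print(git_version_str[12:18]) # option 2: slice after "git version " (assumes a 6-character version)

2.32.1
2.32.1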
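
Position-based extraction depends on the exact layout of the output. As an alternative sketch, a regular expression can look for the first run of digits separated by dots. The sample string is copied from the `git --version` output shown earlier, and the pattern is an assumption about what a version number looks like.

```python
import re

# Sample text copied from the git output shown earlier in this notebook.
git_version_str = "git version 2.32.1 (Apple Git-133)\n"

# One or more digits, followed by one or more ".digits" groups.
match = re.search(r"\d+(\.\d+)+", git_version_str)
version = match.group(0) if match else None
print(version)
```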


### How long does it take to run code?

Let's learn about the `time` function in the `time` module. It returns the current time as the number of seconds since the epoch.

What is the epoch? On most systems it is January 1, 1970, 00:00:00 (UTC). **FUN FACT**: the epoch is considered the beginning of time for computers.

In [19]:
time.time() # number of seconds since Jan 1, 1970

1675230704.639127

In [20]:
start_time = time.time()
# DO SOMETHING (e.g., check_output)
end_time = time.time()

print(end_time - start_time)

7.700920104980469e-05


In [21]:
# TODO: let's convert to milliseconds
print((end_time-start_time) * 1e3)

# TODO: let's convert to microseconds
print((end_time-start_time) * 1e6)

0.07700920104980469
77.00920104980469
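
The start/stop pattern can be wrapped in a small function so it does not have to be retyped. This is a sketch, not part of the lecture code: `elapsed_ms` is a hypothetical helper, and it uses `time.perf_counter()`, a standard-library clock intended for measuring short durations, in place of `time.time()`.

```python
import time

def elapsed_ms(func):
    """Run func once and return the elapsed time in milliseconds."""
    start = time.perf_counter()
    func()
    end = time.perf_counter()
    return (end - start) * 1e3

# Stand-in workload: sum the first 100,000 integers.
ms = elapsed_ms(lambda: sum(range(100_000)))
print(ms, "milliseconds")
print(ms * 1e3, "microseconds")
```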


How long does it take to run simple computations (example: 4 + 5)?

In [22]:
start_time = time.time()
x = 4 + 5
end_time = time.time()

print(end_time - start_time)

0.00014400482177734375


How long does it take to print simple computations (example: 4 + 5)?

In [23]:
start_time = time.time()
print(4 + 5)
end_time = time.time()

print((end_time-start_time) * 1e3)

9
0.762939453125


Printing is a relatively slow operation. If your program prints a lot of output, its performance may suffer!

How long does it take to run a Python program?

Recall that Python can run code passed on the command line, without a script file or an interactive session:
`python3 -c "code"`

In [24]:
start_time = time.time()
check_output(["python3", "-c", "print(4 + 5)"])
end_time = time.time()

print((end_time-start_time) * 1e3)

58.691978454589844


### Every time we run a command, we get a slightly different timing. How can we reduce the noise?

Let's try this with "pwd".

In [25]:
start_time = time.time()
check_output("pwd")
end_time = time.time()

print((end_time-start_time) * 1e3)

23.99897575378418


One approach is to run the command many times and report the average time per run. Recall that the built-in `range` function produces a sequence of integers, starting at 0 by default.

In [26]:
iters = 1000

start_time = time.time()
for i in range(iters):
    check_output("pwd")
end_time = time.time()

print((end_time-start_time) * 1e3 / iters)

6.507018089294434
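
The standard-library `timeit` module automates this kind of repeated measurement. A minimal sketch, where `work` is a hypothetical stand-in for whatever call is being measured:

```python
import timeit

iters = 1000

def work():
    # Stand-in workload; replace with the call you want to measure.
    return sorted([3, 1, 2])

# timeit.timeit returns the total time in seconds for `number` calls.
total_seconds = timeit.timeit(work, number=iters)
avg_ms = total_seconds * 1e3 / iters
print(avg_ms)
```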


### Data structures review
- lists (sequence: ordered)
- sets (not a sequence: not ordered): 
    - indexing doesn't work, but `in` operator works
    - only stores unique values

In [27]:
# TODO: create a simple list of integers
some_numbers = [11, 22, 33]
some_numbers

[11, 22, 33]

In [28]:
# TODO: use range() to produce a list containing 1000000 numbers
some_numbers = list(range(1000000))

`in` operator: existence / constituency match

In [29]:
100 in some_numbers

True

In [30]:
-20 in some_numbers

False

How long does the `in` operator take on a list? It depends on where the item is: a list is searched from the front, so items near the end, and items that are missing, take the longest.

In [31]:
# TODO: time how long it takes to find 99 in some_numbers
start_time = time.time()
99 in some_numbers
end_time = time.time()

print((end_time-start_time) * 1e3)

0.18405914306640625


In [32]:
# TODO: time how long it takes to find 999999 in some_numbers
start_time = time.time()
999999 in some_numbers
end_time = time.time()

print((end_time-start_time) * 1e3)

23.424863815307617


In [33]:
# TODO: time how long it takes to find -1 in some_numbers
start_time = time.time()
-1 in some_numbers
end_time = time.time()

print((end_time-start_time) * 1e3)

22.57990837097168


In [34]:
# TODO: create a simple set of numbers
some_set = {11, 22, 33}
some_set

{11, 22, 33}

In [35]:
# TODO: convert some_numbers into set
some_set = set(some_numbers)

In [36]:
# TODO: time how long it takes to find -1 in some_set
start_time = time.time()
-1 in some_set
end_time = time.time()

print((end_time-start_time) * 1e3)

0.12993812561035156
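
The two measurements can be put side by side. In this sketch, `membership_ms` is a hypothetical helper that averages a few repeats, and `time.perf_counter()` is used as the clock. The list has to compare `-1` against every element, while the set looks the value up by its hash, so the set lookup should be much faster.

```python
import time

def membership_ms(container, value, repeats=5):
    """Return (found, average milliseconds) for `value in container`."""
    start = time.perf_counter()
    for _ in range(repeats):
        found = value in container
    end = time.perf_counter()
    return found, (end - start) * 1e3 / repeats

some_numbers = list(range(1000000))
some_set = set(some_numbers)

list_found, list_ms = membership_ms(some_numbers, -1)
set_found, set_ms = membership_ms(some_set, -1)
print("list:", list_ms, "ms")
print("set: ", set_ms, "ms")
```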
