<a href="https://colab.research.google.com/github/CCIR-Academy/Techcamp2021S-Phase1/blob/main/Section_3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Section 3: The Python Standard Library & Further on IDEs and Utilities




## Assignment

## What is a "library"?
- In plain terms, a library is a collection of reusable code that you can incorporate directly into your own program. You don't need to know how it works internally to get something done, and you don't have to design and implement those features yourself.
- Python comes with a standard set of modules called "The Python Standard Library", from which you can import modules directly without installing them first. Beyond this, you can also install third-party modules with `pip`.

## How to use a library
- Importing: In Python, we use the keyword `import` to reference external code located in The Python Standard Library, in installed third-party libraries, or in other files referenced with relative paths.
    - By convention, we put all import statements at the beginning of the file.
    - We can also import only specific names from a module with `from`, so that they can be used without the module prefix.
    - Moreover, we can bind an imported module or name to a different name of our choice with `as`.

In [None]:
import os
print(os.getcwd()) # This line prints the current working directory.

In [None]:
from os import getcwd
print(getcwd())

In [None]:
from os import getcwd as getCurrentWorkingDirectory
print(getCurrentWorkingDirectory())

# The Python Standard Library (Intermediate)
- We start from "intermediate" because all the built-in types we learned in the previous sections are also part of The Python Standard Library.
- Across different distributions of Python (e.g., the "standard" Python, Anaconda, or MicroPython for embedded development), the list of packages available by default may vary, but in most cases you can manage packages with `pip`.

## Other Commonly Used Built-in Data Types


### datetime


In [None]:
from datetime import datetime

now = datetime.now() # obtain the datetime now
print(now)
print(type(now))

aRandomDateTime =  datetime.strptime('2015-6-1 18:19:59', '%Y-%m-%d %H:%M:%S') # We can generate a datetime object by parsing a string
print(aRandomDateTime)
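
Beyond parsing, `datetime` objects support arithmetic and formatting. The following is a minimal sketch that reuses the timestamp from the cell above; the variable names are chosen only for illustration.

```python
from datetime import datetime, timedelta

start = datetime.strptime('2015-6-1 18:19:59', '%Y-%m-%d %H:%M:%S')
later = start + timedelta(days=1, hours=6)     # Date arithmetic with a timedelta
formatted = later.strftime('%Y-%m-%d %H:%M')   # Convert a datetime back into a string
elapsed = later - start                        # Subtracting two datetimes yields a timedelta

print(formatted)
print(elapsed.total_seconds())
```

`strftime` uses the same format codes as `strptime`, so one format string can serve both directions.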


### JSON and Other Serialization
- Serialization: When we want to store data or transfer it across environments or platforms, we use serialization to convert it into a text-like format that any receiver can handle, as long as it knows how to encode and parse that format.
- JSON: JSON is a syntax for storing and exchanging data. JSON is text, written in JavaScript object notation.
    - Because it originated in JavaScript, the dominant language of web content, most programming languages support JSON in some way. In Python, you will find many similarities between a JSON `object` and the Python `dict` type.
    - JSON is very helpful for storing small amounts of data in light workloads (e.g., configurations or profiles) and for network transmission. However, it is not an efficient choice for more complicated operations.

In [None]:
import json

# Parsing JSON
x = '{ "name":"John", "age":30, "city":"New York"}' # Some JSON in a string
y = json.loads(x) # Parse x; the result is a Python dictionary
print(y["age"])

# Outputting JSON
x = {
  "name": "John",
  "age": 30,
  "city": "New York"
} # A Python object (dict)
y = json.dumps(x) # Convert into JSON; the result is a JSON string
print(y)

| Python | JSON   |
|--------|--------|
| dict   | Object |
| list   | Array  |
| tuple  | Array  |
| str    | String |
| int    | Number |
| float  | Number |
| True   | true   |
| False  | false  |
| None   | null   |
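
The mapping in the table above can be checked with a quick round trip. This is a minimal sketch; the `record` dictionary and its keys are made up for illustration.

```python
import json

# Hypothetical sample data covering several rows of the table
record = {"name": "John", "scores": (90, 85), "active": True, "nickname": None}

encoded = json.dumps(record)   # Python dict -> JSON string
decoded = json.loads(encoded)  # JSON string -> Python dict

print(encoded)
print(decoded)
print(type(decoded["scores"]))  # The tuple comes back as a list
```

Note that the round trip is not fully symmetric: JSON has only one array type, so a `tuple` is written as an array and read back as a `list`.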

- base64: Some data, such as images and other media files, are stored natively as binary that cannot be handled directly as text, but we can use `base64` to encode them into strings.
    - This may not work as intended if there is a limit on the size of the content, because base64 output is roughly one third larger than the original binary data.


In [None]:
import base64

# If the file cannot be found, the runtime environment has been recycled and the image has to be uploaded again.
with open('/content/Section_3/KingsCollegeChapelWest.jpg', 'rb') as f:  # Load the image as binary data
    data = f.read()
    encodestr = base64.b64encode(data) # Encode the image into base64 bytes
    print(encodestr)

b'/9j/4AAQSkZJRgABAgAAZABkAAD/7AARRHVja3kAAQAEAAAAPAAA/+4ADkFkb2JlAGTAAAAAAf/bAIQABgQEBAUEBgUFBgkGBQYJCwgGBggLDAoKCwoKDBAMDAwMDAwQDA4PEA8ODBMTFBQTExwbGxscHx8fHx8fHx8fHwEHBwcNDA0YEBAYGhURFRofHx8fHx8fHx8fHx8fHx8fHx8fHx8fHx8fHx8fHx8fHx8fHx8fHx8fHx8fHx8fHx8f/8AAEQgDAAQBAwERAAIRAQMRAf/EAMgAAAICAwEBAQAAAAAAAAAAAAABAgMEBQYHCAkBAAMBAQEBAQAAAAAAAAAAAAABAgMEBQYHEAACAQMDAQUGAwQGBgcEABcBAgMAEQQhEgUxQVEiEwZhcYEyFAeRoSOxwUIV8NHhUmIW8XKCkjMkokNTNCUXCLLSRFTCY5OjdJQ1VXODs9MmZOKEtNRllTdXGBEAAgIBAwMDAwIGAQMEAgMAAAERAgMhMRJBUQRhIhNxMgWBFJGhQlIjMxXwscHR8WJy4SSSQzT/2gAMAwEAAhEDEQA/ANbYg6D2V9sfLsYuT09tIUyPQ6flQgnUP6GgYiT1tpRoDGT7dD+NEAg0t3ftoEx6H/TTBtdB20/ZSKGNNfwpiGendSQmw1v/AFUwADvNA4Ae+gQftogQwSOppDHoaYSGnZ8aIEx9OnSgBG9Awv2W6UQA7a3/ACpoBAdvbQIZ7NKAAA2Pb3ik2CD9tMAt+FAD6WoB6D7aQSPt/dRAMDb+qgEPXT2UxAfy7ulA2BA7LUBIrDs07L0CkfZ7aAgdrHpQAWtr39lIGGv40wC1ANDI0F6QC+NMQ/dQOR2/KkmDQ9aB9BH8aYToFu6gSQWoDUYoGP2UCQ/6WpBIh/ophIW/CgYWN6AgLf10BA7UgD20wY7f6KAHpfSkxyI9ppolhagaGOtAgHsoAdtaBhbvtQxINKQ2ApikdqBwg7PdQDA20oFIie38KI
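
Encoding is only half of the story: the receiver has to decode the string back into the original bytes. The sketch below uses a short in-memory byte string as a stand-in for file contents, so it runs without any uploaded image.

```python
import base64

raw = b"Hello, binary world!"        # Stand-in for the bytes read from a file
encoded = base64.b64encode(raw)      # bytes -> base64 bytes (ASCII characters only)
text = encoded.decode("ascii")       # base64 bytes -> str, e.g. to embed in JSON
decoded = base64.b64decode(text)     # str -> the original bytes

print(text)
print(decoded == raw)                # The round trip restores the data
print(len(raw), len(encoded))        # The encoded form is longer than the input
```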

### Logging (logging)
- Sometimes we want to generate logs for debugging or technical support, or simply to get instant feedback on whether the program is running as expected. In some cases, we also want to store the logs so that we can access them after the program crashes.
- Read about logging: https://realpython.com/python-logging/

In [None]:
import logging

logging.debug('This is a debug message')
logging.info('This is an info message')
logging.warning('This is a warning message')
logging.error('This is an error message')
logging.critical('This is a critical message')
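
With the default configuration, the root logger only emits messages at WARNING level and above, so the `debug` and `info` calls in the cell above print nothing. A minimal sketch of lowering the threshold follows; the logger name `section3` is hypothetical, and `force=True` assumes Python 3.8 or newer.

```python
import logging

# Assumption: force=True (Python 3.8+) replaces any handlers that the
# notebook environment may already have installed on the root logger.
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s %(message)s",
    force=True,
)

logger = logging.getLogger("section3")  # Hypothetical logger name
logger.debug("This debug message is now shown")
logger.info("So is this info message")
```

To store the logs in a file instead of printing them, `logging.basicConfig` also accepts a `filename` argument.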

### File System IO (os)
- Sometimes we need to handle files in the local file system or access environment variables; in other cases, we run into differences between operating systems. As Python is a cross-platform language, The Python Standard Library provides a uniform interface for these kinds of operations.

In [None]:
with open('/content/Section_3/HelloWorldFile.txt','w') as f:
    f.write("Hello World!")

In [None]:
with open('/content/Section_3/HelloWorldFile.txt', 'r') as f:
    print(f.read())

Hello World!


In [None]:
import os
print(os.name) # The result may vary on different environments.
print(os.environ.get('PATH'))

posix
/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/tools/node/bin:/tools/google-cloud-sdk/bin:/opt/bin


### Path Handling (os.path)
- More often than not, we need to build file paths, but problems arise when we move from one terminal/OS to another, as they may parse path-like strings in different ways. To address that, use `os.path.join()`.

In [None]:
print(os.path.abspath('.')) # Print the absolute path for the current directory
print(os.path.join(os.path.abspath('.'), 'testdir')) # Build a path string using the separator of the current platform
os.mkdir(os.path.join(os.path.abspath('.'), 'testdir')) # Make a directory with the path specified
os.rmdir(os.path.join(os.path.abspath('.'), 'testdir')) # Remove a directory with the path specified

/content
/content/testdir
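
`os.path` can also take paths apart and check whether they exist. The following sketch works inside a temporary directory so that nothing is left behind; the folder and file names are hypothetical.

```python
import os
import tempfile

# Work inside a temporary directory that is removed automatically
with tempfile.TemporaryDirectory() as base:
    path = os.path.join(base, "data", "prices.json")   # Hypothetical file name
    os.makedirs(os.path.dirname(path), exist_ok=True)  # Create the parent directories

    with open(path, "w") as f:
        f.write("{}")

    exists = os.path.exists(path)                 # Check that the file is there
    folder, filename = os.path.split(path)        # Split into directory and file name
    stem, extension = os.path.splitext(filename)  # Split off the file extension
    print(exists, filename, stem, extension)
```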


### Database (SQLite)
- When we want to store data in a complex structure, or when we expect a heavy workload of accessing and modifying data, list-like or object-like data types are less effective than databases that support SQL. SQLite is included in The Python Standard Library (as the `sqlite3` module) because it is easy to set up and use, and a whole database lives in a single file.
- For a more detailed guide, read the official documentation or this article: https://www.digitalocean.com/community/tutorials/how-to-use-the-sqlite3-module-in-python-3

In [None]:
import sqlite3
connection = sqlite3.connect('test.db') # Connect to the database stored in a .db file; if the file doesn't exist, it is created.
cursor = connection.cursor() # We need a cursor to execute queries.
cursor.execute('create table user (id varchar(20) primary key, name varchar(20))') # Create a table called 'user' with an SQL command. Running this cell a second time fails because the table already exists.
cursor.execute("insert into user (id, name) values ('1', 'Michael')") # Insert a row into the table we created

print(cursor.rowcount) # Number of rows affected by the last statement (here, the insert)
cursor.close() # Close the cursor
connection.commit() # Commit the changes we have made to the database
connection.close() # Close the connection

1
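
Reading data back works through the same cursor. The sketch below uses an in-memory database (`':memory:'`) so that it can be re-run freely, and `?` placeholders so that `sqlite3` handles the quoting of values; the sample rows are made up.

```python
import sqlite3

connection = sqlite3.connect(':memory:')  # In-memory database, discarded on close
cursor = connection.cursor()
cursor.execute('create table user (id varchar(20) primary key, name varchar(20))')

# Insert several rows at once; each `?` is filled from the tuple
people = [('1', 'Michael'), ('2', 'Sarah')]  # Hypothetical sample rows
cursor.executemany('insert into user (id, name) values (?, ?)', people)
connection.commit()

cursor.execute('select id, name from user where name = ?', ('Sarah',))
rows = cursor.fetchall()  # A list of tuples, one per matching row
print(rows)

cursor.execute('select count(*) from user')
total = cursor.fetchone()[0]  # Aggregates return a single row
print(total)

connection.close()
```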


## Further on IDEs and Utilities


### Linting
- Linting is a process in which your code is automatically inspected for potential errors and for violations of rules, either custom ones or those of a style standard. Nowadays, IDEs lint in real time as you write and save code; nonetheless, you can always adjust the rules as you prefer. In team or open source projects, it is effective to agree on a popular style standard.
- Recommended settings: Pylint with PEP 8
    - https://www.pylint.org/


### Markdown
- Markdown is a lightweight and easy-to-use syntax for styling all forms of writing on the GitHub platform.
- It is extremely popular as it is easy to read, write, and render stylishly without embedding excessive code.
- GitHub provides an excellent guide which also incorporates specialty features for Markdown documents on GitHub: https://guides.github.com/features/mastering-markdown/
- Colab supports some additional Markdown features beyond those originally supported by Jupyter Notebooks; you can even create a form inside Colab.
    - https://colab.research.google.com/notebooks/forms.ipynb#scrollTo=62YnDE7i9dqP

### Virtual Environment with Conda
- Why Virtual Environment: Because Python is an interpreted programming language that usually requires no compilation in advance, a program is highly sensitive to the environment it runs in: it imports modules from that environment, and third-party libraries are not bundled with the program. This becomes a mess when the version, or even the availability, of certain modules is inconsistent.
- Anaconda: As discussed in Section 1, Anaconda is one of the most popular Python distributions, and it integrates utilities for managing virtual environments.

In [None]:
# Note: This block is meant for a terminal in a local environment and is not executable on Colab.

conda init # Initializes Anaconda for the local shell. Afterwards, the name of the currently active environment appears in parentheses before the command prompt; by default, this is the `base` environment.
conda create -n myVirtualEnvironment python=3.8 # Creates a new virtual environment named `myVirtualEnvironment` (option -n) with Python 3.8.
conda activate myVirtualEnvironment # Activates the newly created virtual environment, whose name should now appear in parentheses at the beginning of the command prompt.
conda deactivate # Returns to the `base` environment.

- Understanding Virtual Environments: Like other virtual environment utilities, Anaconda isolates the development environment from the local system environment; as such, together with a requirements.txt file, you can easily reproduce the exact and minimum environment for your application.
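
One way to see which environment a piece of code actually runs in is to ask the interpreter itself. This is a small sketch using only the `sys` module; inside an activated environment, both paths usually point into that environment's folder.

```python
import sys

print(sys.executable)        # Path of the Python interpreter running this code
print(sys.prefix)            # Root directory of the active environment
print(sys.version_info[:3])  # (major, minor, micro) version of the interpreter
```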

### Package management with pip and conda
- `pip` is the default package/module management utility for Python; when used in a virtual environment, `pip` only modifies packages in the currently active environment. With `pip list`, you can see all the installed packages in the currently active environment.
- To install a package, simply run `pip install <packageName>`; the exact command is usually given in the documentation of the package.
- To install a collection of packages listed in requirements.txt (which is often the case for shared/open source projects), use `pip install -r requirements.txt`.
- To output a list of currently installed packages for sharing, use `pip freeze > requirements.txt`.

# Assignments

## Task 1: Handling JSON
- In this task, we obtain the latest Bitcoin/USDT price data in Python from the WebSocket service provided by Binance.
- To do so, we send a JSON object as a string to specify what we intend to retrieve. In return, the service continuously sends back data as JSON strings as well.
    - Objective 1: In Python, we first construct a `dict` object as follows. Please convert it properly to a `JSON string` before we send it to the WebSocket service.
```python
requestBody: dict = {
"method": "SUBSCRIBE",
"params":
[
"btcusdt@aggTrade"
],
"id": 1
}
```
    - Objective 2: In return, the WebSocket sends back the following JSON strings. Please parse them and collect the results in **one** Python `list`.
```python
messageOne = "{\"e\":\"aggTrade\",\"E\":1624518980050,\"a\":637779838,\"s\":\"BTCUSDT\",\"p\":\"32991.14\",\"q\":\"0.001\",\"f\":1075515676,\"l\":1075515676,\"T\":1624518980046,\"m\":true}"
messageTwo = "{\"e\":\"aggTrade\",\"E\":1624518979662,\"a\":637779822,\"s\":\"BTCUSDT\",\"p\":\"32994.71\",\"q\":\"0.172\",\"f\":1075515629,\"l\":1075515629,\"T\":1624518979613,\"m\":false}"
messageThree = "{\"e\":\"aggTrade\",\"E\":1624519568742,\"a\":637792564,\"s\":\"BTCUSDT\",\"p\":\"33068.99\",\"q\":\"0.002\",\"f\":1075543663,\"l\":1075543663,\"T\":1624519568651,\"m\":true}"
```
- For easy testing, you can use this online WebSocket tester: https://www.websocket.org/echo.html
- For the official documentation, in case you want to try something beyond these tasks: https://binance-docs.github.io/apidocs/futures/en/#live-subscribing-unsubscribing-to-streams

 


## Task 2: Loading files
A dataset collected from the aforementioned WebSocket has been uploaded to the repo. Please download it from the repo, upload it to your Colab notebook environment as shown below, and load the data into **one** Python `list`.


```python
# Only works with Google Chrome
from google.colab import files

uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))
```
- It will prompt you to select a file. Click on “Choose Files” then select and upload the file. Wait for the file to be 100% uploaded. You should see the name of the file once Colab has uploaded it.

## Task 3: Database Handling

Create a SQL database for the price data you obtain from the `JSON` file and then answer the following questions by querying data from the database (refer to Step 3 of https://www.digitalocean.com/community/tutorials/how-to-use-the-sqlite3-module-in-python-3).


1.   What was the first trade ID of the trade that happened at an event time of "1624711290616"?
2.   List all records that have a quantity greater than 1.
3.   Using COUNT, determine the number of records where the quantity was lower than 0.1.
4.   Using AVG, calculate the average price of all records.





## Task 4: Write a Markdown


Create a short markdown file which shows the topics of each week of the coding camp so far in a **list**.
Moreover, include a **tasklist** for all tasks of each week and select all you have completed already.