# Python Style guidelines

- [PEP-8](http://legacy.python.org/dev/peps/pep-0008/)
    - [flake8](http://flake8.readthedocs.org/en/latest/)

# Useful Python modules (stdlib & 3rd party)

- pip
- subprocess
    - Exercise
        - use the `subprocess.run()` function and the `subprocess.Popen` class to launch a subprocess from a Python script (or an interactive session)
        - use the `capture_output` argument of `subprocess.run()` to assign the standard output of a CLI command to a variable
        - assign a value to a variable after running a PIPELINE through subprocess, e.g. `echo hello | tr a-z A-Z` (remember to pass the argument `shell=True`!)
- random: pseudo-random numbers and company
    - random.choice
    - random.randrange
    - random.sample
    - random.shuffle
- [datetime](http://docs.python.org/3/library/datetime.html), [dateutil](http://labix.org/python-dateutil#head-8d03c6c25ead6f9cab0cde83e6f672b52480ab90), [icalendar](http://icalendar.readthedocs.org/en/latest/usage.html#example), pytz
- debugging, performance
    - [pdb](http://docs.python.org/3/library/pdb.html)
        - Exercise: run any python script and examine it with PDB OR your preferred IDE (ex: VS Code)
            - `python -m pdb my_script.py`
            Example code for debugging:
            ```python
            import random

            x = 0
            for _ in range(10):
                x += 1
            print(f"The final value of x is: {x}")
            print(random.randrange(1, 101))
            ```
            - notice that you can get a list of pdb commands by typing `?`
            - set at least one breakpoint using the `break` command and experiment with the `continue`, `next`, and `step` commands
            - use the `display` command to display the values of expressions at various points in the execution of your code
            - experiment with the `restart` command
            - experiment with assigning values to variables during a debugging session
            - experiment with the `list` command
    - [timeit](http://docs.python.org/3/library/timeit.html)
    - [cProfile](http://docs.python.org/3/library/profile.html)
        - profile a script using the command line (`python -m cProfile my_script.py`)
        - sort by `cumtime`
        - save the result to a text file
        - save the result in the default binary output format
            - data in the binary format can be visualised using several tools, ex: [gprof2dot](https://code.google.com/p/jrfonseca/) (needs the `graphviz` package, which provides the `dot` command, to be installed)
            - `gprof2dot -f pstats cProfile.data | dot -Tpng -o my_profile.png`
- testing
    - [tools overview](https://wiki.python.org/moin/PythonTestingToolsTaxonomy)
    - [doctest](http://docs.python.org/3/library/doctest.html)
        - Using the stdlib documentation for the `doctest` module, write at least 3 doctests for code you've written so far in class
        - Run your doctests as follows:
            - `python -m doctest my_module.py`
            - For VERBOSE output: `python -m doctest -v my_module.py`
    - [unittest](http://docs.python.org/3/library/unittest.html)
    - [nosetests](https://nose.readthedocs.org/en/latest/) (3rd party)
    - [pytest](http://pytest.org/latest/index.html) (3rd party)
        - EXERCISE: Unit testing
            - Use `pytest` to create tests for a module you've written in class (the module should contain AT LEAST one function).
            - Run your unit tests via: `pytest`
            - **Test Driven Development (TDD) exercise:**
            - ADD a test that tests some AS YET NONEXISTENT functionality (ex: a new function not yet defined)
                - run your tests, watch the new one FAIL
                - Add the code to your module to make the new test PASS
- Networking
    - tcp socket listen / connect
- Data Serialization
    - unstructured text formats (i.e. make up your own)
    - structured text formats: [json](http://docs.python.org/3/library/json.html), [configparser](http://docs.python.org/3/library/configparser.html), [yaml](https://github.com/yaml/pyyaml)
        - To install yaml: `poetry add PyYAML`
    - [dbm](http://docs.python.org/3/library/dbm.html) / [shelve](http://docs.python.org/3/library/shelve.html) / [pickle](http://docs.python.org/3/library/pickle.html)
        - Exercise:
            - Create a python script that reads a JSON config file, for example the following, stored as `config.json`:
                ```json
                {
                  "user": "vglobal",
                  "base_url": "http://10.13.37.33:8888",
                  "debug_mode": true
                }
                ```
            - Next, you'll need to `import json`.
            - The `json.load` function will allow you to load JSON data
            - The `json.dump` function will allow you to save JSON data
            - make your script make use of each config value
                - ex: if debug_mode, print "We're in debug mode!"
            - have the script SAVE the parsed config data to a DIFFERENT location, `/tmp/config`, before exiting
    - [sqlite3](http://docs.python.org/3/library/sqlite3.html)
        - Exercise:
            - Use the sqlite3 module to:
                - create a new database
                - connect to the database
                - create a new table `transfers` with fields
                    - ip address
                    - site
                    - filename
                    - upload / download
                - insert at least 3 rows into the table
                - run some SELECT queries
    - [sqlalchemy (ORM)](http://docs.sqlalchemy.org/en/latest/orm/tutorial.html)
- re: regular expressions
- Web, HTML
    - [webbrowser](http://docs.python.org/3/library/webbrowser.html)
    - [urllib](http://docs.python.org/3/library/urllib.html)
    - [requests](http://docs.python-requests.org/en/latest/)
    - [BeautifulSoup](http://www.crummy.com/software/BeautifulSoup): for "quick turnaround screen scraping projects"
    - [lxml](http://lxml.de/)
        - Exercise:
            - For a website containing image elements, use `requests` and `Beautiful Soup`, and optionally `feedparser` for RSS, to parse a page, and return a list of the image elements
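            - ex:
            ```python
            import feedparser
            import requests
            from bs4 import BeautifulSoup

            # URL and MY_RSS_URL are placeholders: set them to the page / feed you want to parse
            res = requests.get(URL)

            # for HTML, XML, RSS
            soup = BeautifulSoup(res.text, "html5lib")
            soup.select(...)  # pass a CSS selector here

            # for JSON
            data = res.json()
            # then index into the parsed data, ex: data['key'][0]['value']...

            # alternate approach for RSS
            data = feedparser.parse(MY_RSS_URL)
            ```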
    - [cssselect](http://pythonhosted.org/cssselect/)
    - [scrapy](http://scrapy.org): high-level framework for writing your own scrapers for websites
- web frameworks
    - [bottle](http://bottlepy.org/docs/dev/index.html)
        - Exercise:
            - Create an API with at least 3 endpoints, and then test out your API from your web browser
    - [flask](http://flask.pocoo.org/) --- similar to bottle
    - [django](https://djangoproject.com/)
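- [distutils](http://docs.python.org/3/library/distutils.html) (deprecated since Python 3.10 and removed in Python 3.12; prefer setuptools, listed next, for new projects)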
    - Exercise:
        - package up a module that you have written using the distutils package ([example](http://docs.python.org/3.3/distutils/introduction.html#a-simple-example), [example2](http://docs.python.org/3.3/distutils/setupscript.html#writing-the-setup-script))
        - ex:
            ```
            [create setup.py]
            
            python3 setup.py sdist
            sudo python3 setup.py install --record files.txt
            
            [test importing your module while cd'd to another directory]
            
            # will uninstall "egg-info" file, but not module
            sudo pip3 uninstall <MODULENAME>
            
            # to truly uninstall files, you can do (if you are SURE that everything in files.txt should be deleted)
            sudo xargs rm -rf < files.txt
            ```
- [setuptools](http://pythonhosted.org/setuptools/setuptools.html#developer-s-guide)
    - Exercise:
        - Package up some code that you've written so far in the class for distribution via `setuptools` (i.e. write a `setup.py` file)
            - refer to the setuptools "developer's guide" for reference on the `setup` function
        - Optionally create a `wheel` archive file for distribution using the `wheel` 3rd party library
- [logging](http://docs.python.org/3/library/logging.html)
    - see in particular the [basic tutorial](http://docs.python.org/3/howto/logging.html#logging-basic-tutorial), the [basicConfig method](http://docs.python.org/3/library/logging.html#logging.basicConfig), and the [logrecord attributes](http://docs.python.org/3/library/logging.html#logrecord-attributes)
        - Exercise:
            - write a script that outputs a log to `/tmp/myscript.log` using the logging.basicConfig method
            - configure a custom output format for your logger by passing `format="FORMATSTRING"` to `logging.basicConfig` (see the logrecord-attributes documentation link above)
            - after setting up your logger using the basicConfig method, use methods like `logging.warning, logging.info, logging.debug, logging.critical` to write to your logfile
            - inspect your logfile from the console (ex: with `cat` or `less`) and ensure the file is being logged to
- misc modules
    - [csv](http://docs.python.org/3/library/csv.html)
    - [glob](http://docs.python.org/3/library/glob.html)
    - [feedparser](http://pythonhosted.org//feedparser/) (3rd party)
        - ex:
            ```
            import feedparser
            feed = feedparser.parse('http://www.reddit.com/r/DailyProgrammer/.rss')
            [e['link'] for e in feed['entries']]
            ```
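
For the JSON config exercise above, a minimal sketch might look like the following. It assumes the `config.json` contents shown in the exercise. The helper names `load_config` and `save_config` are made up for illustration, and a temporary directory stands in for both the working directory and `/tmp/config` so the snippet runs anywhere.

```python
import json
import tempfile
from pathlib import Path


def load_config(path):
    """Read a JSON config file and return it as a dict (hypothetical helper)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def save_config(config, path):
    """Write the config dict back out as JSON (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(config, fh, indent=2)


# A temporary directory stands in for the real locations used in the exercise.
workdir = Path(tempfile.mkdtemp())
source = workdir / "config.json"
source.write_text(
    '{"user": "vglobal", "base_url": "http://10.13.37.33:8888", "debug_mode": true}',
    encoding="utf-8",
)

config = load_config(source)
if config["debug_mode"]:
    print("We're in debug mode!")
print(f"user={config['user']} base_url={config['base_url']}")

# Save the parsed data to a different location before exiting.
target = workdir / "config_copy.json"
save_config(config, target)
```

Note that `json.load` turns the JSON literal `true` into the Python value `True`, so the config values can be used directly in `if` statements.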

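For the sqlite3 exercise above, one possible shape of a solution is sketched here. The column names (`ip_address`, `direction`) and the sample rows are invented for illustration, and an in-memory database is used so nothing is written to disk.

```python
import sqlite3

# ":memory:" keeps the database in RAM; pass a filename instead to create a file on disk.
conn = sqlite3.connect(":memory:")

conn.execute(
    """
    CREATE TABLE transfers (
        ip_address TEXT,
        site       TEXT,
        filename   TEXT,
        direction  TEXT  -- 'upload' or 'download'
    )
    """
)

# Made-up sample rows; the "?" placeholders let sqlite3 handle quoting.
rows = [
    ("10.0.0.1", "site-a", "report.pdf", "upload"),
    ("10.0.0.2", "site-b", "photo.png", "download"),
    ("10.0.0.1", "site-a", "notes.txt", "download"),
]
conn.executemany("INSERT INTO transfers VALUES (?, ?, ?, ?)", rows)
conn.commit()

# A filtered query and an aggregate query.
downloads = conn.execute(
    "SELECT filename FROM transfers WHERE direction = ? ORDER BY filename",
    ("download",),
).fetchall()
per_ip = conn.execute(
    "SELECT ip_address, COUNT(*) FROM transfers GROUP BY ip_address ORDER BY ip_address"
).fetchall()
print(downloads)
print(per_ip)
conn.close()
```

Each result row comes back as a tuple, even when only one column is selected.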

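For the logging exercise above, a minimal sketch could look like this. The format string is just one example built from the documented LogRecord attributes. The log goes to a temporary directory instead of `/tmp/myscript.log` so the snippet stays portable, and `force=True` assumes Python 3.8 or newer.

```python
import logging
import os
import tempfile

# The exercise uses /tmp/myscript.log; a temp directory keeps the sketch portable.
log_path = os.path.join(tempfile.mkdtemp(), "myscript.log")

logging.basicConfig(
    filename=log_path,
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    force=True,  # replace any handlers already configured (Python 3.8+)
)

logging.debug("debug message")
logging.info("info message")
logging.warning("warning message")
logging.critical("critical message")

logging.shutdown()  # flush and close the file handler

# Read the file back, much like inspecting it with `cat` or `less`.
with open(log_path, encoding="utf-8") as fh:
    contents = fh.read()
print(contents)
```

Without `level=logging.DEBUG` the root logger defaults to `WARNING`, so the `debug` and `info` lines would not reach the file.
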
## re
- [documentation](http://docs.python.org/3/library/re.html)

        import re

        # case-insensitive pattern: optional leading whitespace, then an http:// URL
        HTTP_URL = re.compile(r'\s*http://.*', re.I)
        m = HTTP_URL.match('http://news.com')
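
To see what the `HTTP_URL` pattern above does, here is a small sketch that tries it against a few strings. The test strings are made up. The pattern is written as a raw string (`r'...'`) so that `\s` reaches the regex engine unchanged.

```python
import re

HTTP_URL = re.compile(r'\s*http://.*', re.I)

# match() anchors at the start of the string and returns a Match object or None.
m = HTTP_URL.match('http://news.com')
print(m.group())  # the text that matched

# re.I makes the scheme case-insensitive; \s* tolerates leading whitespace.
upper = HTTP_URL.match('  HTTP://NEWS.COM')
no_match = HTTP_URL.match('ftp://news.com')
print(upper is not None, no_match is None)
```

Because `match()` returns `None` on failure, check the result before calling `.group()` on it.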

## random
- [documentation](http://docs.python.org/3/library/random.html)
- random.choice
- random.randrange
- random.sample
- random.shuffle
- random.seed

        import random

        random.seed(12345)
        random.random()  # 0.41661987254534116, always the same when starting from the same seed

        # shuffle the cards list in place
        cards = ['king of hearts', 'jack of spades', 'two of diamonds']
        random.shuffle(cards)
        cards  # e.g. ['two of diamonds', 'jack of spades', 'king of hearts']

        # return a random integer in the range provided
        random.randrange(5, 600)  # e.g. 450

        # return a random element from the list
        random.choice(['monkey', 'towel', 'volkswagen'])  # e.g. 'towel'

        # return 10 random values from a list/population (here provided by the range)
        random.sample(range(1, 100), 10)  # e.g. [78, 11, 88, 2, 98, 19, 27, 54, 31, 30]

        # after seeding, the sequence of results is reproducible
        random.seed(123)
        random.randrange(11)  # recorded output: 0
        random.randrange(11)  # recorded output: 4
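
The point about seeding can be checked directly: two generators started from the same seed produce the same sequence. The sketch below uses a hypothetical helper `draw` and a private `random.Random` instance, so it does not disturb the module-level generator used in the examples above.

```python
import random


def draw(seed, n=5):
    """Return n pseudo-random integers from a private generator (hypothetical helper)."""
    rng = random.Random(seed)  # independent generator; module-level state is untouched
    return [rng.randrange(11) for _ in range(n)]


first = draw(123)
second = draw(123)
print(first == second)  # same seed, same sequence
```

This is handy in tests: seed the generator so that "random" behaviour is repeatable from run to run.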

## subprocess
- [documentation](http://docs.python.org/3/library/subprocess.html)
- launch a subprocess with `subprocess.call()` (see also the `shell=True` argument)
    - return the output of a command (ex: `ls`) in a variable with subprocess.check_output()
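
The subprocess exercises from the outline could be approached roughly as follows. This is a sketch, not the only way to do it: it assumes Python 3.7+ (for `capture_output` and `text`) and a POSIX shell with `echo` and `tr` for the pipeline part.

```python
import subprocess
import sys

# Run a command given as a list of arguments and capture its standard output.
# sys.executable is used so the example does not depend on a particular CLI tool.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from a child process')"],
    capture_output=True,  # Python 3.7+
    text=True,            # decode bytes to str
    check=True,           # raise CalledProcessError on a non-zero exit status
)
greeting = result.stdout.strip()

# A pipeline needs a shell to interpret "|", hence shell=True (POSIX shell assumed).
# Only do this with trusted command strings.
piped = subprocess.run(
    "echo hello | tr a-z A-Z",
    shell=True, capture_output=True, text=True, check=True,
)
shouted = piped.stdout.strip()

# Popen gives lower-level control; communicate() waits and returns (stdout, stderr).
with subprocess.Popen(
    [sys.executable, "-c", "print(6 * 7)"], stdout=subprocess.PIPE, text=True
) as proc:
    out, _ = proc.communicate()

print(greeting, shouted, out.strip())
```

`subprocess.run()` covers most everyday cases; reach for `Popen` when you need to interact with the process while it is still running.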

## socket
- [documentation](http://docs.python.org/3/library/socket.html)
    - [example](http://docs.python.org/3/library/socket.html#example)

Echo server program:

    import socket

    HOST = ''                 # Symbolic name meaning all available interfaces
    PORT = 50007              # Arbitrary non-privileged port
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((HOST, PORT))
        s.listen(1)
        conn, addr = s.accept()
        with conn:
            print('Connected by', addr)
            while True:
                data = conn.recv(1024)
                if not data:
                    break
                conn.sendall(data)

Output once a client connects:

    Connected by ('127.0.0.1', 38185)
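
To try the "tcp socket listen / connect" pair end to end without opening two terminals, the sketch below runs a one-shot echo server in a background thread and connects to it from the same script. The names `echo_once`, `worker`, and `reply` are illustrative, and port `0` is used so the OS picks a free port instead of the fixed 50007.

```python
import socket
import threading

HOST = "127.0.0.1"

# Bind to port 0 so the OS picks a free port (the echo server above uses 50007).
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind((HOST, 0))
server.listen(1)
port = server.getsockname()[1]


def echo_once():
    """Accept one connection and echo data back until the client stops sending."""
    conn, _addr = server.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:
                break
            conn.sendall(data)


worker = threading.Thread(target=echo_once, daemon=True)
worker.start()

# Client side: connect, send a message, then read the echo until the server closes.
with socket.create_connection((HOST, port), timeout=5) as client:
    client.sendall(b"Hello, world")
    client.shutdown(socket.SHUT_WR)  # signal that we are done sending
    chunks = []
    while True:
        chunk = client.recv(1024)
        if not chunk:
            break
        chunks.append(chunk)
reply = b"".join(chunks)

worker.join(timeout=5)
server.close()
print("Received", repr(reply))
```

Sockets carry bytes, not `str`, which is why the message is written as `b"Hello, world"`.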
