- Title: Python Tips and Traps
- Slug: python-tips-and-traps
- Date: 2020-10-10 11:55:25
- Category: Computer Science
- Tags: programming, Python, tips, traps
- Author: Ben Du

https://docs.python-guide.org/writing/gotchas/

https://stackoverflow.com/questions/101268/hidden-features-of-python


1. One of the trickiest problems in Python is conflicting package/module names. 
    For example, 
    if you have a Python script named `abc.py` in the current directory 
    and your script depends on the Python module `collections.abc`,
    your code will likely fail to run. 
    Another (really tricky) example: if you have a Python script named `pyspark.py`
    and submit it to a Spark cluster (using `spark-submit`),
    the PySpark application will likely throw an error saying that the `pyspark` module is not found.
    Those are just 2 commonly seen examples. 
    You can easily run into this issue when you run ad hoc Python scripts 
    (you are unlikely to encounter it when you develop a Python package).
    A possible way to avoid this issue is to always prefix the names of your ad hoc Python scripts with an underscore (`_`).
    Another solution is to remove the empty string 
    (which represents the current working directory) from `sys.path`
    if your Python script does not import other modules in the current directory.

        import sys
        if "" in sys.path:  # remove("") raises ValueError when the entry is absent
            sys.path.remove("")

2. The `int` type in Python 3 is unbounded,
    which means that there is no limit on the range of integers that an `int` can represent. 
    However, 
    there are still various integer-related limits in Python 3 due to the interpreter's word size
    (which is the same as the machine's word size in most cases).
    For example, `sys.maxsize` is the largest possible length of a list or other in-memory sequence.

        x = 1111111111111111111111111111111111111111111111111111111111111111111111111111
        type(x)
        # int
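The contrast between unbounded integers and `sys.maxsize` can be sketched as follows. The variable names are illustrative only, and the printed value of `sys.maxsize` depends on the build (it is usually `2**63 - 1` on a 64-bit interpreter).

```python
import sys

# Python integers grow as needed; this value is far beyond 64 bits.
big = 2 ** 200
print(big > sys.maxsize)   # True

# sys.maxsize bounds container lengths and indexes, not integer values.
print(sys.maxsize)

# A length beyond sys.maxsize is rejected, even for a lazy object like range.
try:
    len(range(sys.maxsize + 1))
except OverflowError as err:
    overflow_message = str(err)
    print(overflow_message)
```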

3. Use type annotations to make your code more readable and easier to understand.

2. Restrict the types of objects that your function/method can be applied to 
  and raise an exception (e.g., `TypeError`) when an object of a wrong type is provided.
  This helps minimize surprises.

3. AVOID returning objects of different types from a Python function/method.
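The three pieces of advice above (annotate, reject wrong types early, return one type) might look like this in practice. The function `mean` and the alias `Number` are hypothetical names made up for this sketch, and `TypeError` is used here as the conventional exception for a wrong argument type.

```python
from typing import List, Union

Number = Union[int, float]


def mean(values: List[Number]) -> float:
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not isinstance(values, list):
        raise TypeError(f"values must be a list, got {type(values).__name__}")
    if not values:
        raise ValueError("values must not be empty")
    # Always return a float, never an int in some cases and None in others.
    return float(sum(values)) / len(values)


print(mean([1, 2, 3]))   # 2.0
```

Note that annotations are not enforced at runtime, which is why the explicit `isinstance` check is still needed if you want an early failure.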

4. Be CAREFUL about module name conflicts!
  They happen when more than one Python script with the same name is on the Python module search paths.
  This is often not a problem when developing Python packages thanks to absolute and relative imports.
  However, 
  this issue can come up in a few situations and can be quite tricky for users to figure out.
    a. If a user runs a Python script whose parent directory contains a Python script with a name conflicting with other (official) Python modules,
      this issue will come up.
    b. An even trickier situation is when a Python script is piped to a Python interpreter directly. 
      According to [docs about sys.path](https://docs.python.org/3/library/sys.html#sys.path),
      `sys.path[0]` is initialized to the empty string on the startup of Python in this situation, 
      which directs Python to search for modules in the current directory first.
      Since a user can run the command in any directory, 
      they are more likely to encounter the issue of conflicting module names. 
  If a Python script is never intended to be imported as a module, 
  one way to resolve the issue is to simply remove the first element from `sys.path`.
  

        import sys
        sys.path.pop(0)
        # '/workdir/archives/blog/misc/content'
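When you suspect a name conflict, one way to diagnose it is to ask the import system which file a module name resolves to. The helper `module_origin` below is a hypothetical name for this sketch; it relies only on `importlib.util.find_spec` from the standard library.

```python
import importlib.util


def module_origin(name: str):
    """Return the location a top-level module name resolves to, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# If this prints a path inside your working directory instead of the
# standard library, a local script is shadowing the real module.
print(module_origin("json"))
print(module_origin("no_such_module_xyz"))   # None
```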

7. Many programming languages (Python included) follow the convention
  of not returning anything (in other words, returning void, None, etc.)
  from a mutator method, so you cannot chain calls on a mutator method.
  Functional programming languages encourage method chaining
  as they often have immutable objects and the methods return new objects
  rather than changing the original objects.

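2. Python functions (except lambda functions) do not automatically return the value of their last expression,
  unlike functions in many functional programming languages.
  A function without a `return` statement returns `None`,
  and forgetting the `return` statement is a common mistake in Python.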
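Both points can be seen in a few lines. This is a minimal sketch using the built-in `list.sort` and `sorted`; the two `add` functions are made-up examples.

```python
numbers = [3, 1, 2]

# list.sort mutates in place and returns None, so it cannot be chained.
result = numbers.sort()
print(result)    # None
print(numbers)   # [1, 2, 3]

# sorted returns a new list and leaves its argument unchanged.
ordered = sorted([3, 1, 2], reverse=True)
print(ordered)   # [3, 2, 1]


def add_missing_return(a, b):
    a + b  # computed, then discarded: the function returns None


def add(a, b):
    return a + b


print(add_missing_return(1, 2))   # None
print(add(1, 2))                  # 3
```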

3. According to [Python Operator Precedence](https://docs.python.org/2/reference/expressions.html#operator-precedence),
  the ternary expression `if - else` has a very low precedence
  (lower than arithmetic operators such as `+`). 
  However, it has higher precedence than the tuple operator `,`.
  It is suggested that you always use parentheses to make the precedence clear 
  when you use the ternary expression in a complicated expression.
  Below is an example illustrating the precedence of the ternary expression.

        update = {
            "status": "succeed", 
            "partitions": 52,
            "size": 28836,
            "end_time": 1563259850.937318
        }
        [key + " = " + f"{val}" if isinstance(val, (int, float)) else f"'{val}'" for key, val in update.items()]

        ["'succeed'",
         'partitions = 52',
         'size = 28836',
         'end_time = 1563259850.937318']

        update = {
            "status": "succeed", 
            "partitions": 52,
            "size": 28836,
            "end_time": 1563259850.937318
        }
        [key + " = " + (f"{val}" if isinstance(val, (int, float)) else f"'{val}'") for key, val in update.items()]

        ["status = 'succeed'",
         'partitions = 52',
         'size = 28836',
         'end_time = 1563259850.937318']

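10. A backslash (`\`) cannot be used inside the expression part (within `{}`) of an f-string
  before Python 3.12 (f-strings were introduced in Python 3.6).
  There are multiple ways to work around this.
  First, you can precompute the strings that need `\` and refer to them by name in the f-string.
  Second, you can use `chr(92)` (which returns the backslash) 
  or `chr(10)` (which returns the newline character `\n`) instead.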
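A short sketch of both workarounds follows; it avoids backslashes inside the `{}` part, so it should behave the same on any Python version that has f-strings. The variable names are illustrative.

```python
items = ["a", "b", "c"]

# Option 1: precompute the string that contains the backslash escape.
newline = "\n"
joined = f"{newline.join(items)}"

# Option 2: build the character from its code point.
joined_chr = f"{chr(10).join(items)}"   # chr(10) is the newline character
windows_path = f"C:{chr(92)}temp"       # chr(92) is the backslash itself

print(joined == joined_chr)   # True
print(windows_path)
```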

5. If you need traceback information when reporting an error, use `raise ExceptionClass(msg)`;
  otherwise, use `sys.exit(msg)` instead.
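The difference between the two ways of stopping can be sketched as below. The function names and the message are made up for illustration; `sys.exit(msg)` works by raising `SystemExit`, which is why it can be caught and inspected here.

```python
import sys


def fail_with_traceback(msg: str):
    # An uncaught exception prints a traceback followed by the message.
    raise RuntimeError(msg)


def fail_quietly(msg: str):
    # sys.exit(msg) raises SystemExit; if uncaught, the interpreter prints
    # msg to stderr and exits with status 1, without a traceback.
    sys.exit(msg)


# Catching SystemExit keeps this example from terminating the interpreter.
try:
    fail_quietly("input file not found")
except SystemExit as exc:
    exit_code = exc.code

print(exit_code)   # input file not found
```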


6. You cannot use `return` to hand a result to the code iterating over a generator function.
  Instead, a `return` ends the generator by raising `StopIteration`
  (in Python 3 the returned value is stored in the `value` attribute of that exception).
  Please see this [issue](https://stackoverflow.com/questions/26595895/return-and-yield-in-the-same-function)
  for more discussions.
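To see where the returned value of a generator goes in Python 3, consider this small sketch. The generators `countdown` and `wrapper` are hypothetical examples.

```python
def countdown(n):
    while n > 0:
        yield n
        n -= 1
    return "done"   # becomes StopIteration.value, not a yielded item


print(list(countdown(3)))   # [3, 2, 1]

gen = countdown(1)
next(gen)
try:
    next(gen)
except StopIteration as stop:
    returned = stop.value

print(returned)             # done


def wrapper():
    # "yield from" evaluates to the inner generator's return value.
    result = yield from countdown(2)
    yield result


print(list(wrapper()))      # [2, 1, 'done']
```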

7. The module `importlib.resources` (available since Python 3.7) leverages Python's import system to provide access to resources within packages. 
  If you can import a package, 
  you can access resources within that package. 
  Resources can be opened or read, in either binary or text mode.
  Resources are roughly akin to files inside directories, 
  though it’s important to keep in mind that this is just a metaphor. 
  Resources and packages do not have to exist as physical files and directories on the file system. 
  `importlib.resources` is analogous to `getClass().getResource` in Java.
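As a rough illustration, the sketch below reads a file that ships inside the standard-library package `json`. The package is chosen only because it is always importable; in practice you would point at your own package and data file. It assumes Python 3.9 or later, where `importlib.resources.files` is available.

```python
from importlib import resources

# Locate a resource relative to an importable package, not to a file path.
resource = resources.files("json").joinpath("__init__.py")

# Resources can be read in text mode (shown here) or binary mode (read_bytes).
text = resource.read_text(encoding="utf-8")
print(len(text.splitlines()))
```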



## Issues and Solutions

### Cannot Import an Installed Module

I have run into the issue that some packages cannot be imported even though they have been installed.
The issue was due to file permissions (the installed Python packages were not readable).
A simple fix (though not optimal) is to make these Python packages readable, 
e.g., by changing their permissions to `777` (sudo required).

## Advanced Python 

[Talk: Anthony Shaw - Why is Python slow?](https://www.youtube.com/watch?v=I4nkgJdVZFA)

[When Python Practices Go Wrong - Brandon Rhodes - code::dive 2019](https://www.youtube.com/watch?v=S0No2zSJmks)



<img src="http://dclong.github.io/media/python/python.png" height="200" width="240" align="right"/>

## Python Doc

https://docs.quantifiedcode.com/python-anti-patterns/index.html


## Functions

https://jeffknupp.com/blog/2018/10/11/write-better-python-functions/

https://softwareengineering.stackexchange.com/questions/225682/is-it-a-bad-idea-to-return-different-data-types-from-a-single-function-in-a-dyna

https://stackoverflow.com/questions/1839289/why-should-functions-always-return-the-same-type

https://treyhunner.com/2018/04/keyword-arguments-in-python/


[Comprehensive Python Cheatsheet](https://gto76.github.io/python-cheatsheet/)

- [The Python Wiki](https://wiki.python.org/moin/)
- [Useful Modules](https://wiki.python.org/moin/UsefulModules)
- [The Hitchhiker’s Guide to Python!](http://docs.python-guide.org/en/latest/)
- [PyVideo](http://pyvideo.org/)

## Environment Variable

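1. `os.getenv` gets the value of an environment variable,
  while assigning to `os.environ` (e.g., `os.environ["NAME"] = "value"`) creates a new environment variable or
  sets the value of an existing one.
  There is no `os.setenv`; `os.putenv` exists but does not update `os.environ`.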

2. You should use `os.path.expanduser("~")` instead of `os.getenv('HOME')`
  to get the home directory of the current user in Python.
  `os.getenv('HOME')` only works reliably on Linux/Unix.
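A minimal sketch of reading and setting environment variables is shown below. The variable names `MY_APP_MODE` and `MY_APP_UNSET` are invented for this example and are assumed not to be set already.

```python
import os

# Set (or overwrite) a variable for this process and its child processes.
os.environ["MY_APP_MODE"] = "debug"

mode = os.getenv("MY_APP_MODE")                    # "debug"
missing = os.getenv("MY_APP_UNSET", "fallback")    # default when unset

# Portable way to get the current user's home directory.
home = os.path.expanduser("~")

print(mode, missing, home)
```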

## Programming Skills

1. Python variadic arguments (`*args`, `**kwargs`) can mimic function overloading.

3. Python eval

4. `*args` and `**kwargs`. Parameters after `*` (or `*args`) must be passed by keyword.

6. `sys.stdout.write`, `sys.stdout.flush`, `print`

9. Global variables in a Python module are readable but not writable from functions in the same module by default:
  assigning to the name inside a function creates a new local variable instead.
  If you want to write to a global variable in a function (in the same module),
  you have to declare the global variable in the function using the keyword `global`.
  For example, if `x` is a global variable
  and you want to assign to it in a function,
  you have to declare `global x` in the function before `x` is used.
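The behavior of global variables and of keyword-only parameters described in this list can be sketched as follows. All names (`counter`, `increment`, `connect`, and so on) are hypothetical.

```python
counter = 0


def read_counter():
    return counter          # reading a global needs no declaration


def broken_increment():
    counter = counter + 1   # assignment makes "counter" local to the function


def increment():
    global counter          # now assignments target the module-level name
    counter += 1


def connect(host, *, timeout=10):
    # Parameters after the bare * can only be passed by keyword.
    return host, timeout


increment()
after = read_counter()

try:
    broken_increment()
    failed = False
except UnboundLocalError:
    failed = True

print(after, failed)                    # 1 True
print(connect("localhost", timeout=3))  # ('localhost', 3)
```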


## Numerical

1. The `/` operator performs float (true) division in Python 3, which is different from Python 2.
    If you want integer (floor) division,
    use the `//` operator.
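A few concrete cases make the distinction clearer, including the behavior for negative operands, which sometimes surprises people coming from C-like languages.

```python
true_div = 7 / 2        # 3.5: true division always yields a float
floor_div = 7 // 2      # 3: floor division of ints yields an int
negative = -7 // 2      # -4: floors toward negative infinity, not toward zero
float_floor = 7.0 // 2  # 3.0: floor division of floats yields a float
quotient, remainder = divmod(7, 2)   # (3, 1)

print(true_div, floor_div, negative, float_floor, quotient, remainder)
```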

## Misc

1. Elements of a set and keys of a dict must be hashable in Python,
    which in practice means immutable for built-in types
    (similar restrictions hold in other programming languages too).
    Since a list in Python is mutable,
    it cannot be used as a set element or a dict key.
    You have to convert it to an immutable equivalent (e.g., a tuple).
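The sketch below shows the failure and the usual fix. The names are illustrative, and the exact wording of the error message may vary between Python versions.

```python
point = [1, 2]

# A list is unhashable, so using it as a dict key raises TypeError.
try:
    lookup = {point: "label"}
except TypeError as err:
    message = str(err)

# Converting to a tuple gives a hashable, immutable equivalent.
lookup = {tuple(point): "label"}
seen = {tuple(point)}

# frozenset plays the same role for sets.
groups = {frozenset({1, 2}), frozenset({2, 1})}

print(message)
print(lookup[(1, 2)], (1, 2) in seen, len(groups))   # label True 1
```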

2. Use `sys.exit(msg)` to print an error message and quit when an error happens.

3. Get the class name of an object.

		type(obj).__name__

## File System

2. `os.mkdir` acts like `mkdir` in Linux and `os.makedirs` acts like `mkdir -p` in Linux.
    Both of them throw an exception if the directory already exists
    (unless `exist_ok=True` is passed to `os.makedirs`).

3. Use `exec(open(filename).read())` to source a file in Python 3
    (`execfile(filename)` in Python 2).
    Names defined in `filename` become visible in the namespace where `exec` runs;
    when called at module level, this is the module's global namespace.
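The difference between `os.mkdir` and `os.makedirs` can be tried safely inside a temporary directory. This is a minimal sketch; the directory names `a` and `b` are arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    nested = os.path.join(root, "a", "b")

    # os.mkdir cannot create missing parent directories.
    try:
        os.mkdir(nested)
        mkdir_failed = False
    except FileNotFoundError:
        mkdir_failed = True

    os.makedirs(nested)                 # creates "a" and then "a/b"

    # A second call fails unless exist_ok=True is passed.
    try:
        os.makedirs(nested)
        second_call_failed = False
    except FileExistsError:
        second_call_failed = True

    os.makedirs(nested, exist_ok=True)  # no error
    created = os.path.isdir(nested)

print(mkdir_failed, second_call_failed, created)   # True True True
```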

## Encoding

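`ord` returns the Unicode code point of a character
(the ASCII number for an ASCII character).
`chr` returns the one-character string for a code point
(Python 2 needs `unichr` for non-ASCII code points).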
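In Python 3, `ord` and `chr` are inverses of each other over Unicode code points, as this short sketch shows. Escape sequences are used for the non-ASCII characters to keep the source ASCII-only.

```python
code_a = ord("A")              # 65
letter = chr(97)               # "a"
code_e_acute = ord("\u00e9")   # 233, beyond ASCII but still a valid code point
han = chr(0x4E2D)              # the character U+4E2D

# Round trip: chr(ord(c)) gives back the same character.
round_trip = chr(ord(han)) == han

print(code_a, letter, code_e_acute, round_trip)   # 65 a 233 True
```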

## Syntax

1. Python evaluates expressions from left to right.

7. You can use a `dict` to mimic the switch statement in other programming languages.
    However, it is kind of evil and has very limited usage.
    You should avoid using it.
    Just use multiple `if ... else ...` branches instead.

5. `:` in a slice has lower priority than arithmetic operators in Python
    (`x[1:n-1]` slices from index 1 up to `n-1`),
    which is the opposite of R (where `1:n-1` means `(1:n)-1`).

3.  `return v1, v2` returns a tuple `(v1, v2)`.
    And if a function `f` returns a tuple `(v1, v2, v3)`,
    you can unpack it using
    `v1, v2, v3 = f()`.

11. Stay away from functions/methods/members starting with `_`.
    For example,
    you should use the built-in function `len` to get the length of a list
    instead of using its method `__len__`.

7. Python does not support `++` and `--` but supports `+=`, `-=`, etc.
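Several of the syntax points above can be checked in a few lines. This is only a sketch, and the names (`min_max`, `letters`, `count`) are made up for illustration.

```python
def min_max(values):
    return min(values), max(values)   # returns a single tuple


low, high = min_max([4, 1, 9])        # unpacked into two names

# Arithmetic inside a slice is evaluated before the slice is taken.
letters = list("abcdef")
n = len(letters)
middle = letters[1:n - 1]             # ['b', 'c', 'd', 'e']

# Prefer the built-in len over calling the underscore method directly.
same_length = len(letters) == letters.__len__()

count = 0
count += 1       # the augmented assignment replaces count++
same = ++count   # parses as +(+count): two unary pluses, no increment

print(low, high, middle, same_length, count, same)
```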


### Encryption

- [pycrypto](https://pypi.python.org/pypi/pycrypto)

https://github.com/dlitz/pycrypto

http://stackoverflow.com/questions/3504955/using-rsa-in-python

## References

- [Common Mistakes](http://www.toptal.com/python/top-10-mistakes-that-python-programmers-make)

- [Least Astonishment and The Mutable Default Argument](https://stackoverflow.com/questions/1132941/least-astonishment-and-the-mutable-default-argument)

- [Maximum and Minimum Values for Ints](https://stackoverflow.com/questions/7604966/maximum-and-minimum-values-for-ints)

- [The Hitchhiker's Guide to Python](https://docs.python-guide.org/writing/documentation/)

- [Python日报](http://py.memect.com/)

- [Python Homepage](http://www.python.org/)

- [Python Documentation](http://docs.python.org/py3k/)

- [LearningJython](http://wiki.python.org/jython/LearningJython)

- [Jython Tutorial](http://www.jython.org/currentdocs.html)

- [PEP 8 -- Style Guide for Python](http://legacy.python.org/dev/peps/pep-0008/)

- [Documenting Python](https://devguide.python.org/documenting/)

- http://builtoncement.org/

- https://pythonhosted.org/pyCLI/