- Title: Python Tips and Traps
- Slug: python-tips-and-traps
- Date: 2020-03-17 00:06:02
- Category: Computer Science
- Tags: programming, Python, tips, traps
- Author: Ben Du

https://docs.python-guide.org/writing/gotchas/

https://stackoverflow.com/questions/101268/hidden-features-of-python


1. One of the trickiest problems in Python is conflicting package/module names.
    For example,
    if you have a Python script named `abc.py` in the current directory
    and your script depends on the Python module `collections.abc`,
    your code will likely fail to run.
    Another (really tricky) example is a Python script named `pyspark.py`
    that you submit to a Spark cluster (using `spark-submit`).
    The PySpark application will likely throw an error saying that the `pyspark` module is not found.
    Those are just 2 commonly seen examples.
    You can easily run into this issue when you run ad hoc Python scripts
    (you are unlikely to encounter it when you develop a Python package).
    A possible way to avoid this issue is to always prefix the names of your ad hoc Python scripts with an underscore (`_`).
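The shadowing effect described above can be reproduced in a few lines. This is only a minimal sketch: the helper `demo_shadowing` and the choice of `colorsys` as the victim module are made up for illustration, and the temporary directory stands in for "the directory you run your ad hoc script from".

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_shadowing(module_name: str = "colorsys") -> bool:
    """Return True if a local file shadows the standard-library module."""
    with tempfile.TemporaryDirectory() as tmp:
        # A hypothetical ad hoc script that happens to share a stdlib module's name.
        (Path(tmp) / f"{module_name}.py").write_text("SHADOW = True\n")
        saved = sys.modules.pop(module_name, None)  # forget any cached import
        sys.path.insert(0, tmp)  # mimic running a script from that directory
        try:
            importlib.invalidate_caches()
            module = importlib.import_module(module_name)
            # The marker attribute only exists in the fake module.
            return getattr(module, "SHADOW", False)
        finally:
            # Undo every change so the real module is importable again.
            sys.path.remove(tmp)
            sys.modules.pop(module_name, None)
            if saved is not None:
                sys.modules[module_name] = saved


print(demo_shadowing())  # True: the local file won over the standard library
```

The import picks up the local file because the directory sits in front of the standard library on `sys.path`.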

2. The `int` type in Python 3 is unbounded,
    which means that there is no limit (other than available memory) on the range of integers that an `int` can represent.
    However,
    there are still various integer-related limits in Python 3 due to the interpreter's word size,
    which is the same as the machine's word size in most cases.
    `sys.maxsize` (typically `2**63 - 1` on a 64-bit build) is the size of the largest possible list or in-memory sequence.

In [1]:
x = 1111111111111111111111111111111111111111111111111111111111111111111111111111
type(x)

int
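To make the difference between "unbounded `int`" and "bounded sizes" concrete, here is a small sketch. The helper name `length_fits` is made up for this example, and the `OverflowError` is what CPython raises; other interpreters may behave differently.

```python
import sys

big = sys.maxsize + 1
print(type(big) is int)   # True: still an ordinary int
print(big > sys.maxsize)  # True: int values are not capped at sys.maxsize


def length_fits(n: int) -> bool:
    """Return True if a sequence of length n can report its length via len()."""
    try:
        len(range(n))  # range is lazy, so no memory is allocated here
        return True
    except OverflowError:
        # CPython: the length does not fit into a C ssize_t
        return False


print(length_fits(sys.maxsize))      # True
print(length_fits(sys.maxsize + 1))  # False
```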

3. Use type annotations to make your code more readable and easier to understand.

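2. Restrict the types of objects that your function/method can be applied to
  and raise an exception when a wrong type is provided
  (`TypeError` is the conventional choice; `ValueError` is meant for a value of the right type that is otherwise unacceptable).
  This helps minimize surprises.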
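A minimal sketch of an annotated function that rejects wrong types early. The function `scale` and the alias `Number` are hypothetical names for this example. The sketch raises `TypeError`, the built-in exception Python itself uses for arguments of the wrong type; swap in another exception class if your code base prefers one.

```python
from typing import Union

Number = Union[int, float]


def scale(value: Number, factor: Number = 2) -> float:
    """Multiply value by factor, rejecting non-numeric input early."""
    for name, arg in (("value", value), ("factor", factor)):
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(arg, bool) or not isinstance(arg, (int, float)):
            raise TypeError(f"{name} must be int or float, got {type(arg).__name__}")
    return float(value * factor)


print(scale(3))  # 6.0

try:
    scale("3")  # without the check this would silently return "33"
except TypeError as err:
    print(err)  # value must be int or float, got str
```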

3. AVOID returning objects of different types from a Python function/method.
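As an illustration of the point about return types, the sketch below always returns a list, so callers never have to check whether they received `None`, a single string, or a list. The function `find_matches` is a made-up example.

```python
from typing import List


def find_matches(words: List[str], prefix: str) -> List[str]:
    """Always return a list: empty, one element, or many."""
    return [word for word in words if word.startswith(prefix)]


words = ["spark", "spam", "python"]
print(find_matches(words, "sp"))  # ['spark', 'spam']
print(find_matches(words, "py"))  # ['python']
print(find_matches(words, "x"))   # []
```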

4. Be CAREFUL about module name conflicts!
  They happen when more than one Python script with the same name is on the Python module search path.
  This is often not a problem when developing Python packages due to absolute and relative imports.
  However,
  this issue can come up in a few situations and can be quite tricky for users to figure out.
    a. If a user runs a Python script whose parent directory contains a Python script with a name conflicting with another (official) Python module,
      this issue will come up.
    b. An even trickier situation is when a Python script is piped to a Python interpreter directly.
      According to [docs about sys.path](https://docs.python.org/3/library/sys.html#sys.path),
      `sys.path[0]` is initialized to the empty string on the startup of Python in this situation,
      which directs Python to search for modules in the current directory first.
      Since a user can run the command in any directory,
      it is more likely for them to encounter the issue of conflicting module names.
  If a Python script is never intended to be imported as a module,
  one way to resolve the issue is to simply remove the first element from `sys.path`.
  

In [2]:
import sys
sys.path.pop(0)

'/workdir/archives/blog/misc/content'

7. Many programming languages (Python included) follow the convention
  of not returning anything (or in other words, returning void, None, etc.)
  from a mutator method, so you cannot chain calls on a mutator method.
  Functional programming languages encourage chaining on methods
  as they often have immutable objects and the methods return new objects
  rather than changing the original objects.
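The built-in list type shows this convention in action: `list.sort()` mutates in place and returns `None`, while `sorted()` returns a new list that can be chained on. A short sketch:

```python
numbers = [3, 1, 2]

result = numbers.sort()  # mutator: sorts in place and returns None
print(result)            # None
print(numbers)           # [1, 2, 3]

# sorted() returns a new list, so further operations can be chained on it.
smallest_two = sorted([3, 1, 2])[:2]
print(smallest_two)      # [1, 2]
```

Writing `numbers = numbers.sort()` is therefore a bug: it replaces the list with `None`.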

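2. Python functions (except lambda functions) do not automatically return the value of their last expression,
  unlike functions in many functional programming languages.
  A function without a `return` statement returns `None`,
  so forgetting a `return` statement is a common mistake in Python.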
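A small sketch of the mistake; the function names are made up for the example.

```python
def add_missing_return(a, b):
    a + b  # the sum is computed and then discarded


def add(a, b):
    return a + b


add_lambda = lambda a, b: a + b  # a lambda returns its expression implicitly

print(add_missing_return(1, 2))  # None
print(add(1, 2))                 # 3
print(add_lambda(1, 2))          # 3
```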

3. According to [Python Operator Precedence](https://docs.python.org/2/reference/expressions.html#operator-precedence),
  the ternary expression `if - else` has a very low precedence. 
  However, it has higher precedence than the tuple operator `,`.
  It is suggested that you always use parentheses to make the precedence clear 
  when you use the ternary expression in a complicated expression.
  Below is an example illustrating the precedence of the ternary expression.

In [7]:
update = {
    "status": "succeed", 
    "partitions": 52,
    "size": 28836,
    "end_time": 1563259850.937318
}
[key + " = " + f"{val}" if isinstance(val, (int, float)) else f"'{val}'" for key, val in update.items()]

["'succeed'",
 'partitions = 52',
 'size = 28836',
 'end_time = 1563259850.937318']

In [8]:
update = {
    "status": "succeed", 
    "partitions": 52,
    "size": 28836,
    "end_time": 1563259850.937318
}
[key + " = " + (f"{val}" if isinstance(val, (int, float)) else f"'{val}'") for key, val in update.items()]

["status = 'succeed'",
 'partitions = 52',
 'size = 28836',
 'end_time = 1563259850.937318']
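The interaction with the comma can be seen in an even smaller sketch (the variable names are arbitrary): the conditional expression binds tighter than the comma, so only the last tuple element is conditional unless you add parentheses.

```python
flag = False

pair = 1, 2 if flag else 3       # parsed as 1, (2 if flag else 3)
grouped = (1, 2) if flag else 3  # parentheses make the whole tuple conditional

print(pair)     # (1, 3)
print(grouped)  # 3
```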

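10. A backslash (`\`) cannot be used inside the expression part (the `{...}`) of an f-string
  (f-strings were introduced in Python 3.6; the restriction was lifted in Python 3.12).
  There are multiple ways to work around this.
  First, you can precompute the things that need `\` and reference them in the f-string.
  Second, you can use `chr(10)` (which returns the newline character `\n`) or `chr(92)` (which returns the backslash itself) instead.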
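Both workarounds in a short sketch that runs on any Python version with f-strings. Note that `chr(10)` is the newline character and `chr(92)` is the backslash. The variable names are arbitrary.

```python
items = ["a", "b", "c"]

# Workaround 1: precompute the string that contains the backslash escape.
newline = "\n"
joined = f"{newline.join(items)}"

# Workaround 2: build the character with chr() inside the expression.
via_chr = f"{chr(10).join(items)}"          # chr(10) is the newline character
windows_path = f"C:{chr(92)}{items[0]}"     # chr(92) is the backslash

print(joined == via_chr)  # True
print(windows_path)       # C:\a
```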

5. If you need traceback information when reporting an error, use `raise ExceptionClass(msg)`;
  otherwise, use `sys.exit(msg)`, which prints `msg` to stderr and exits with status 1.
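Under the hood, `sys.exit(msg)` raises `SystemExit` with `msg` stored in its `code` attribute; the interpreter prints a string code to stderr without a traceback and exits with status 1. The sketch below catches the exception only to make that visible. The helper `exit_code_of` is a made-up name.

```python
import sys


def exit_code_of(func):
    """Call func and return the SystemExit code it raised, or None."""
    try:
        func()
    except SystemExit as exc:
        return exc.code
    return None


code = exit_code_of(lambda: sys.exit("input file not found"))
print(code)  # input file not found
```

In a real script you would not catch `SystemExit`; you would let it propagate so that the process terminates.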


6. You cannot use `return` to hand a result to the caller of a generator function in the usual way.
  Instead, `return value` in a generator raises `StopIteration` with `value` stored in its `value` attribute
  (a plain `for` loop silently discards it).
  Please see this [issue](https://stackoverflow.com/questions/26595895/return-and-yield-in-the-same-function)
  for more discussion.
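A short sketch of what happens to the returned value; the generator names are made up. Iterating normally drops the value, while `StopIteration.value` and `yield from` can recover it.

```python
def countdown(n):
    while n > 0:
        yield n
        n -= 1
    return "liftoff"  # becomes StopIteration("liftoff")


def wrapper(n):
    # yield from evaluates to the value returned by the inner generator.
    result = yield from countdown(n)
    yield result


print(list(countdown(3)))  # [3, 2, 1]  (the returned value is discarded)

gen = countdown(1)
next(gen)
try:
    next(gen)
except StopIteration as stop:
    returned = stop.value
print(returned)            # liftoff

print(list(wrapper(2)))    # [2, 1, 'liftoff']
```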

7. Use `max(some_iterable, default=0)` to get a fallback value instead of a `ValueError` when the iterable is empty.

8. Use `itertools.chain(iter1, iter2, ...)` to iterate over several iterables in sequence without concatenating them first.
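A quick sketch of the last two tips, using arbitrary sample data.

```python
from itertools import chain

empty = []

try:
    max(empty)
except ValueError:
    print("max() of an empty iterable raises ValueError")

largest = max(empty, default=0)  # fall back to 0 instead of raising
print(largest)                   # 0

# chain() walks the iterables one after another; they may be of different types.
combined = list(chain([1, 2], (3,), "ab"))
print(combined)                  # [1, 2, 3, 'a', 'b']
```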

## Issues and Solutions

### Cannot Import an Installed Module

I have run into the issue that some packages cannot be imported even though they have been installed.
The issue was due to file permissions (the installed Python packages were not readable).
A simple fix (though not optimal) is to make these Python packages readable,
e.g., make the permissions `777` (sudo required).
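To check whether permissions are the culprit, a rough diagnostic sketch follows. The helper `check_module_readable` is a made-up name, and the sketch assumes the module is backed by a regular file; if a parent directory is itself inaccessible, the module may not be found at all and the function returns `None`.

```python
import importlib.util
import os


def check_module_readable(name: str):
    """Return (path, readable) for a file-based module, or None if no file is found."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.has_location:
        return None  # not installed, or built-in/frozen (no file to inspect)
    return spec.origin, os.access(spec.origin, os.R_OK)


print(check_module_readable("json"))
```

If the second element is `False`, the current user cannot read the file, which matches the situation described above.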

## References

- [Common Mistakes](http://www.toptal.com/python/top-10-mistakes-that-python-programmers-make)

- [Least Astonishment and The Mutable Default Argument](https://stackoverflow.com/questions/1132941/least-astonishment-and-the-mutable-default-argument)

- [Maximum and Minimum Values for Ints](https://stackoverflow.com/questions/7604966/maximum-and-minimum-values-for-ints)