# Some Coding Tips

## Python extension with PyCharm inspection

Add the following statements at the top of `pyscf/__init__.py` (edit the pyscf package itself, not dh):
```python
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
```
After this change, `import pyscf.dh` is no longer flagged as an error.
This is related to https://youtrack.jetbrains.com/issue/PY-38434.


## Using `lib.call_in_background`

```python
from pyscf import lib
import numpy as np
import time
```

The following example simulates this workload:

- In each iteration, reading element `a` from disk into memory takes 300 ms.
- Evaluating `a + b` takes 1000 ms.
- Five iterations are required.

The reads can be overlapped with the computation by using the convenient class `pyscf.lib.call_in_background` from PySCF.

```python
a_in_disk = np.array(range(10)) * 2
```

```python
def read_300ms(a_in_comp, idx):
    """ Read `a_in_disk[idx]` into `a_in_comp`, taking 300 ms. """
    # Avoid indexing past the end; the prefetch loop asks for idx + 1,
    # so this check is essential for this task.
    if idx >= a_in_disk.size:
        return
    # Simulate reading something from disk to memory.
    time.sleep(0.3)
    a_in_comp[0] = a_in_disk[idx]

def compute_1000ms(a, b):
    """ Compute a + b, taking 1000 ms. """
    time.sleep(1)
    return a + b
```

Without load optimization, this process can be done in the following naive way:

```python
t0 = time.time()
a_actual = np.array([-1])
for idx in range(5):
    read_300ms(a_actual, idx)
    b = idx
    result = compute_1000ms(a_actual, b)
    t1 = time.time()
    print("Result of a+b:", result)
    # time.time() returns seconds, so the elapsed time is reported in seconds
    print("Time elapsed: {:6.3f} s".format(t1 - t0))
    t0 = t1
```

```
Result of a+b: [0]
Time elapsed:  1.302 s
Result of a+b: [3]
Time elapsed:  1.301 s
Result of a+b: [6]
Time elapsed:  1.301 s
Result of a+b: [9]
Time elapsed:  1.302 s
Result of a+b: [12]
Time elapsed:  1.301 s
```


The basic idea of load optimization is that, while a time-consuming computation is running, we quietly prefetch the elements required by the next computation. With the callable class `lib.call_in_background`, the loop becomes:

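```python
t0 = time.time()

with lib.call_in_background(read_300ms) as prefetch_a:
    # Two buffers: one is consumed by the computation, the other is being filled.
    a_prefetch = np.array([-1])
    a_actual = np.array([-1])

    # The first read cannot be overlapped with any computation.
    read_300ms(a_prefetch, 0)
    for idx in range(5):
        # The buffer filled last becomes the one used in this iteration.
        a_prefetch, a_actual = a_actual, a_prefetch
        # Start reading the next element in the background.
        prefetch_a(a_prefetch, idx+1)
        b = idx
        result = compute_1000ms(a_actual, b)
        t1 = time.time()
        print("Result of a+b:", result)
        # time.time() returns seconds, so the elapsed time is reported in seconds
        print("Time elapsed: {:6.3f} s".format(t1 - t0))
        t0 = t1
```

```
Result of a+b: [0]
Time elapsed:  1.302 s
Result of a+b: [3]
Time elapsed:  1.001 s
Result of a+b: [6]
Time elapsed:  1.001 s
Result of a+b: [9]
Time elapsed:  1.001 s
Result of a+b: [12]
Time elapsed:  1.001 s
```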
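
The per-iteration timings above agree with a simple back-of-the-envelope estimate. The short sketch below only does the arithmetic; the variable names are illustrative and not part of PySCF, and it assumes the idealized costs stated earlier (300 ms per read, 1000 ms per computation, no thread overhead).

```python
n_iter = 5
t_read = 0.3      # seconds per read (idealized)
t_compute = 1.0   # seconds per computation (idealized)

# Naive loop: every iteration pays for the read and the computation.
t_serial = n_iter * (t_read + t_compute)

# Prefetching loop: only the first read is exposed; each later read runs
# while the previous computation is in progress, so an iteration costs
# whichever of the two steps is slower.
t_prefetch = t_read + n_iter * max(t_read, t_compute)

print(f"serial:   {t_serial:.1f} s")
print(f"prefetch: {t_prefetch:.1f} s")
```

Under these assumptions the total drops from about 6.5 s to about 5.3 s. The gain is bounded by the cheaper of the two steps, so prefetching helps most when reading and computing take comparable time.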


Note that the function `lib.map_with_prefetch` has similar functionality.
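
For readers without PySCF at hand, the same prefetching pattern can be sketched with only the Python standard library. This is an illustration of the idea, not PySCF's implementation: `load`, `compute`, and `run_with_prefetch` are hypothetical names, and the delays are shortened to 30 ms and 100 ms so the example runs quickly.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def load(idx):
    """Hypothetical slow read: returns 2 * idx after a short delay."""
    time.sleep(0.03)
    return 2 * idx

def compute(a, b):
    """Hypothetical slow computation of a + b."""
    time.sleep(0.1)
    return a + b

def run_with_prefetch(n_iter):
    """Overlap the read for iteration idx + 1 with the computation for idx."""
    results = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        # The first read has no computation to hide behind.
        future = pool.submit(load, 0)
        for idx in range(n_iter):
            a = future.result()  # wait for the pending read to finish
            if idx + 1 < n_iter:
                # Start the next read in the background thread.
                future = pool.submit(load, idx + 1)
            # This computation runs while the next read is in progress.
            results.append(compute(a, idx))
    return results

t0 = time.perf_counter()
results = run_with_prefetch(5)
elapsed = time.perf_counter() - t0
print("Results:", results)
print("Total time: {:.2f} s".format(elapsed))
```

Returning the loaded value through a future avoids the explicit buffer swap used with `lib.call_in_background`, at the cost of allocating a new object per read instead of reusing preallocated arrays.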