#### Import Variants and Misconceptions

I would like to briefly discuss the various import variants such as:

* `import math`
* `from math import sqrt, fabs`
* `from math import *`
* `import math as r_math`
* `from math import sqrt as r_sqrt`

##### import math

* loads the entire module (`math`) in memory if it's not already there
* adds a reference to it in `sys.modules` with a key of `math`
* adds a symbol of the same name (`math`) in our current namespace referencing the `math` object

##### import math as r_math

* loads the entire module (`math`) in memory if it's not already there
* adds a reference to it in `sys.modules` with a key of `math`
* adds the symbol `r_math` to our current namespace referencing the `math` object

##### from math import sqrt

* loads the entire module (`math`) in memory if it's not already there
* adds a reference to it in `sys.modules` with a key of `math`
* adds the symbol `sqrt` to our current namespace referencing the `math.sqrt` function
* it **does not** add the symbol `math` to our current namespace

##### from math import sqrt as r_sqrt

* loads the entire module (`math`) in memory if it's not already there
* adds a reference to it in `sys.modules` with a key of `math`
* adds the symbol `r_sqrt` to our current namespace referencing the `math.sqrt` function
* it **does not** add the symbol `math` to our current namespace

##### from math import *

* loads the entire module (`math`) in memory if it's not already there
* adds a reference to it in `sys.modules` with a key of `math`
* adds all the exported symbols of the `math` module directly to our current namespace (we'll see later how underscores or `__all__` control what a module/package exports)
* it **does not** add the symbol `math` to our current namespace
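
As a rough illustration of the export rules mentioned in the list above, here is a minimal sketch. It assumes a throwaway in-memory module with the hypothetical name `demo_star` (not part of the original material), registered by hand in `sys.modules` so that the `import *` statement can find it:

```python
import sys
import types

# Build a throwaway module in memory (hypothetical name: demo_star).
demo = types.ModuleType('demo_star')
demo.public = 1
demo._private = 2
demo.also_public = 3
sys.modules['demo_star'] = demo

# Without __all__, names starting with an underscore are skipped.
ns_default = {}
exec('from demo_star import *', ns_default)

# With __all__, exactly the listed names are exported.
demo.__all__ = ['public', '_private']
ns_all = {}
exec('from demo_star import *', ns_all)

# exec() adds '__builtins__' to the namespace on its own; ignore it.
default_names = sorted(n for n in ns_default if n != '__builtins__')
all_names = sorted(n for n in ns_all if n != '__builtins__')
print(default_names)  # ['also_public', 'public']
print(all_names)      # ['_private', 'public']

# Clean up the fake module.
del sys.modules['demo_star']
```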

As you can see, in **every** instance, the module is imported and a reference to it is added to `sys.modules`. The variants really have to do with what is injected into our current **namespace**: the module name, an alias to it, just the specified symbols from the module, or all the exported symbols from the module.
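
One way to check this summary is to run each variant in its own empty namespace and list the names it created. This is only a sketch: the helper `names_bound_by` is a hypothetical name introduced here, and it relies on `exec` to run each statement in a fresh dictionary:

```python
import sys

def names_bound_by(statement):
    """Run an import statement in an empty namespace and list the names it bound."""
    namespace = {}
    exec(statement, namespace)
    # exec() adds '__builtins__' on its own; ignore it.
    return sorted(name for name in namespace if name != '__builtins__')

print(names_bound_by('import math'))                      # ['math']
print(names_bound_by('import math as r_math'))            # ['r_math']
print(names_bound_by('from math import sqrt'))            # ['sqrt']
print(names_bound_by('from math import sqrt as r_sqrt'))  # ['r_sqrt']

star_names = names_bound_by('from math import *')
print('sqrt' in star_names, 'math' in star_names)         # True False

# Whatever the variant, the module itself ends up cached.
print('math' in sys.modules)                              # True
```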

#### Misconceptions

This leads to the first misconception:

"You should use

`from math import sqrt, fabs`

rather than 

`import math`

because that way you only import what you need and you're not having Python load the entire module."

That's just not true for `math`, or in fact for any *simple* module.

For *packages* that have subpackages, that may or may not be true - we'll see that later.

Let's actually test this out.

We have to be a little careful, because Jupyter imports a ton of modules and packages:

In [1]:
import sys
for key in sorted(sys.modules.keys()):
    print(key)

IPython
IPython.core
IPython.core.alias
IPython.core.application
IPython.core.async_helpers
IPython.core.autocall
IPython.core.builtin_trap
IPython.core.compilerop
IPython.core.completer
IPython.core.completerlib
IPython.core.crashhandler
IPython.core.debugger
IPython.core.display
IPython.core.display_trap
IPython.core.displayhook
IPython.core.displaypub
IPython.core.error
IPython.core.events
IPython.core.excolors
IPython.core.extensions
IPython.core.formatters
IPython.core.getipython
IPython.core.history
IPython.core.hooks
IPython.core.inputtransformer2
IPython.core.interactiveshell
IPython.core.latex_symbols
IPython.core.logger
IPython.core.macro
IPython.core.magic
IPython.core.magic_arguments
IPython.core.magics
IPython.core.magics.auto
IPython.core.magics.basic
IPython.core.magics.code
IPython.core.magics.config
IPython.core.magics.display
IPython.core.magics.execution
IPython.core.magics.extension
IPython.core.magics.history
IPython.core.magics.logging
IPython.core.magics.namespac

so they're already loaded and in the `sys.modules` dictionary.


Fortunately `cmath` is not one of them, so we'll use that one.

In [2]:
'cmath' in sys.modules

False

Let's go ahead and just import a single symbol from `cmath`, the `exp` function:

In [3]:
from cmath import exp

Now let's see if `cmath` and `exp` are in our module (global) namespace:

In [4]:
'cmath' in globals()

False

In [5]:
'exp' in globals()

True

OK, so basically what that import did was create a symbol for `exp` in our namespace, but not for `cmath`.

Does this mean that `cmath` was only "partially" loaded?

How can Python "partially" load a (simple) module? How would it even know what to load up? Sure, maybe it could do some fancy kind of introspection and determine all the dependencies the symbols we are importing require. But it does not.

It simply imports the entire module (using the techniques we have been covering in the last few videos).

If we really want to partially load something, we would use a package, which, while still a `module` type, can be composed of several sub-packages. More on that later.
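
To give a rough idea of what "partially loading" a package looks like, here is a minimal sketch. It assumes a throwaway package with the hypothetical name `demo_pkg`, written to a temporary directory just for this demonstration:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Create a tiny package with two submodules (all names are hypothetical).
    pkg_dir = Path(tmp) / 'demo_pkg'
    pkg_dir.mkdir()
    (pkg_dir / '__init__.py').write_text('')
    (pkg_dir / 'sub_a.py').write_text('VALUE = "a"\n')
    (pkg_dir / 'sub_b.py').write_text('VALUE = "b"\n')

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Import only one of the two submodules.
        import demo_pkg.sub_a
        loaded = sorted(k for k in sys.modules if k.startswith('demo_pkg'))
    finally:
        sys.path.remove(tmp)

print(loaded)  # ['demo_pkg', 'demo_pkg.sub_a']
```

The package and the requested submodule are cached, but `demo_pkg.sub_b` was never loaded.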

In fact, let's look at it in `sys.modules`:

In [6]:
sys.modules['cmath']

<module 'cmath' (built-in)>

Yep, it's there...

We can even get a handle to the `cmath` module:

In [7]:
cmath = sys.modules['cmath']

In [8]:
cmath

<module 'cmath' (built-in)>

And now we can use `cmath` just as if we had done 

`import cmath`

But you'll note that in this case we did not import the module, we did `from cmath import exp` only.

So we can use `exp` directly because of how we imported that specific symbol:

In [9]:
exp(2+3j)

(-7.315110094901103+1.0427436562359045j)

But we can also use the `cmath` module directly now that we retrieved it from `sys.modules`:

In [10]:
cmath.sqrt(1+1j)

(1.09868411346781+0.45508986056222733j)

So, the **entire** `cmath` module was loaded when we ran `from cmath import exp`, not just a portion of it!

The only thing that happened is that Python put `cmath` in `sys.modules`, but **did not** add a `cmath` symbol to our module namespace, and **only added** the function `exp` to our namespace.

What about doing something like this:

`from cmath import *`

This is often frowned upon, and sometimes for good reason - but this is not a universal truth either.

Let's see why, in our current context, it's maybe not such a good thing.

First let's see what our global namespace looks like:

In [11]:
globals()

{'In': ['',
  'import sys\nfor key in sorted(sys.modules.keys()):\n    print(key)',
  "'cmath' in sys.modules",
  'from cmath import exp',
  "'cmath' in globals()",
  "'exp' in globals()",
  "sys.modules['cmath']",
  "cmath = sys.modules['cmath']",
  'cmath',
  'exp(2+3j)',
  'cmath.sqrt(1+1j)',
  'globals()'],
 'Out': {2: False,
  4: False,
  5: True,
  6: <module 'cmath' (built-in)>,
  8: <module 'cmath' (built-in)>,
  9: (-7.315110094901103+1.0427436562359045j),
  10: (1.09868411346781+0.45508986056222733j)},
 '_': (1.09868411346781+0.45508986056222733j),
 '_10': (1.09868411346781+0.45508986056222733j),
 '_2': False,
 '_4': False,
 '_5': True,
 '_6': <module 'cmath' (built-in)>,
 '_8': <module 'cmath' (built-in)>,
 '_9': (-7.315110094901103+1.0427436562359045j),
 '__': (-7.315110094901103+1.0427436562359045j),
 '___': <module 'cmath' (built-in)>,
 '__builtin__': <module 'builtins' (built-in)>,
 '__builtins__': <module 'builtins' (built-in)>,
 '__doc__': 'Automatically created module

Now let's do that import:

In [12]:
from cmath import *

And let's see our namespace now:

In [13]:
globals()

{'In': ['',
  'import sys\nfor key in sorted(sys.modules.keys()):\n    print(key)',
  "'cmath' in sys.modules",
  'from cmath import exp',
  "'cmath' in globals()",
  "'exp' in globals()",
  "sys.modules['cmath']",
  "cmath = sys.modules['cmath']",
  'cmath',
  'exp(2+3j)',
  'cmath.sqrt(1+1j)',
  'globals()',
  'from cmath import *',
  'globals()'],
 'Out': {2: False,
  4: False,
  5: True,
  6: <module 'cmath' (built-in)>,
  8: <module 'cmath' (built-in)>,
  9: (-7.315110094901103+1.0427436562359045j),
  10: (1.09868411346781+0.45508986056222733j),
  11: {...}},
 '_': {...},
 '_10': (1.09868411346781+0.45508986056222733j),
 '_11': {...},
 '_2': False,
 '_4': False,
 '_5': True,
 '_6': <module 'cmath' (built-in)>,
 '_8': <module 'cmath' (built-in)>,
 '_9': (-7.315110094901103+1.0427436562359045j),
 '__': (1.09868411346781+0.45508986056222733j),
 '___': (-7.315110094901103+1.0427436562359045j),
 '__builtin__': <module 'builtins' (built-in)>,
 '__builtins__': <module 'builtins' (built-i

Some people say the namespace was "polluted". In a way I guess that's true, but it does mean I can now access **all** attributes in `cmath` without prefixing them with `cmath` all the time: 

In [14]:
sqrt(2+2j)

(1.5537739740300374+0.6435942529055826j)

In [15]:
pi

3.141592653589793

In [16]:
sin(2-3j)

(9.15449914691143+4.168906959966565j)

In and of itself, there's nothing wrong with that...

But a couple of issues:

The first one is that when I call `sin` just like that, someone reading my code does not immediately know where that function came from. Was it a function I implemented in my module? Some other custom module? The `cmath` module? The `math` module?

The second one is that you can run into serious problems if you also need to import the `math` module:

Currently the `sqrt` symbol is the `cmath.sqrt` function:

In [17]:
sqrt

<function cmath.sqrt>

In [18]:
from math import *

What just happened to the `sqrt` function that was in our namespace?

In [19]:
sqrt

<function math.sqrt>

As you can see, the symbol `sqrt` in our namespace no longer refers to the `sqrt` function in `cmath` but rather to the one in `math`.

It just got replaced by the `sqrt` function in the `math` module because it has the same name (`sqrt`).

This is one of the reasons why `from ... import *` is sometimes frowned upon.

But the same problem can happen if you use a `from` import this way:

In [20]:
from cmath import sqrt
from math import sqrt

The same thing happened here: the `math.sqrt` function just clobbered the `cmath.sqrt` function.

One option here is to use:

In [21]:
import cmath
import math

In [22]:
math.sqrt(2)

1.4142135623730951

In [23]:
cmath.sqrt(2+2j)

(1.5537739740300374+0.6435942529055826j)

But Python also allows us to alias our imports using the `as` keyword.

We can alias either the entire module, or just the symbols being imported from the module:

In [24]:
import math as r_math
import cmath as c_math

In [25]:
r_math

<module 'math' (built-in)>

In [26]:
c_math

<module 'cmath' (built-in)>

In [27]:
r_math.sqrt(2)

1.4142135623730951

In [28]:
c_math.sqrt(2)

(1.4142135623730951+0j)

By the way, this is the **exact** same result as doing:

In [29]:
import importlib

In [30]:
r_math = importlib.import_module('math')
c_math = importlib.import_module('cmath')

In [31]:
r_math

<module 'math' (built-in)>

In [32]:
c_math

<module 'cmath' (built-in)>

We can also alias symbols from the imported module:

In [33]:
from math import sqrt as r_sqrt
from cmath import sqrt as c_sqrt

In [34]:
r_sqrt

<function math.sqrt>

In [35]:
c_sqrt

<function cmath.sqrt>

Again, we can reproduce this using the following:

In [36]:
r_sqrt = importlib.import_module('math').sqrt
c_sqrt = importlib.import_module('cmath').sqrt

In [37]:
r_sqrt

<function math.sqrt>

In [38]:
c_sqrt

<function cmath.sqrt>

At the end of the day, the module is always loaded and cached (in `sys.modules`). The different variants of the `import` statement merely determine which symbols are added to our module (global) namespace. That's it.

It's a little different for packages as we'll see later.

#### Efficiency

The final thing we need to look at is often mentioned in various blog posts and online discussions.

`import variant #1` is more "efficient" than `import variant #2`

Maybe so, but realistically by how much?

Another common claim is that the following is terribly wrong because it re-imports the `math` module **every** time `my_func` is called:

In [39]:
def my_func(a):
    import math
    return math.sqrt(a)

From a readability standpoint, yes, that is **not** a good idea. Much better to put all your imports at the top of the module once in a location where any reader can easily see all your module dependencies.

But as far as reloading the module goes, you should now understand that's absolutely not true. After the first load has occurred, the nested `import` only has to look the module up in the `sys.modules` dictionary and bind the name locally; it does not reload the entire module!

Dictionary lookups are blazingly fast in Python - so, yes, there is some overhead, but not as much as you may think.
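
A small sketch can make the "lookup, not reload" point concrete. It assumes a throwaway module with the hypothetical name `demo_once`, whose body creates a brand new object every time it is executed. If a nested import re-ran the module, we would get a different object on each call:

```python
import importlib
import sys
import tempfile
from pathlib import Path

def get_token():
    import demo_once  # hypothetical module, created below
    return demo_once.TOKEN

with tempfile.TemporaryDirectory() as tmp:
    # The module body creates a fresh object every time it is executed.
    (Path(tmp) / 'demo_once.py').write_text('TOKEN = object()\n')
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        first = get_token()
        second = get_token()
        # Same object: the second nested import did not re-run the module.
        same_without_reload = first is second

        # An explicit reload does re-run the module body.
        importlib.reload(sys.modules['demo_once'])
        same_after_reload = get_token() is first
    finally:
        sys.path.remove(tmp)

print(same_without_reload)  # True
print(same_after_reload)    # False
```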

So, let's write some timing code to test these things and see how they compare.

We should consider both relative and absolute speed differences.

If you optimize a piece of code and cut its run time by 50%, that sounds good. But what if the original code ran in `1`s and now runs in `0.5`s? How long does the total program run? Down from `30`s to `29.5`s? Things are relative...
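
To put numbers on that example, here is the arithmetic spelled out as a quick sketch, using the figures from the paragraph above (the variable names are just for illustration):

```python
total_before = 30.0  # whole program, in seconds
part_before = 1.0    # the piece of code we optimized
part_after = 0.5     # the same piece after optimizing

total_after = total_before - (part_before - part_after)

part_saving = (part_before - part_after) / part_before * 100
total_saving = (total_before - total_after) / total_before * 100

print(f'{part_saving:.1f}%')   # 50.0%
print(f'{total_saving:.2f}%')  # 1.67%
```

A 50% improvement on that piece of code is well under 2% for the program as a whole.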

In [40]:
from time import perf_counter

Yes, I'm using a `from` import - for readability and typing reasons. How many other modules are out there where I run the risk of clobbering `perf_counter`? I can't think of one. Certainly not in any imports I'm going to be using here. It's such a unique name, I feel pretty safe!

I'm also going to write a small utility function that compares two timings to each other:

In [41]:
from collections import namedtuple

Timings = namedtuple('Timings', 'timing_1 timing_2 abs_diff rel_diff_perc')

def compare_timings(timing1, timing2):
    # relative difference of timing2 vs timing1, as a percentage of timing1
    rel_diff = (timing2 - timing1) / timing1 * 100

    # negative differences mean timing2 was faster than timing1
    timings = Timings(round(timing1, 1),
                      round(timing2, 1),
                      round(timing2 - timing1, 2),
                      round(rel_diff, 2))
    return timings

##### Timing using fully qualified `module.symbol` 

In [42]:
test_repeats = 10_000_000

In [43]:
import math

start = perf_counter()
for _ in range(test_repeats):
    math.sqrt(2)
end = perf_counter()
elapsed_fully_qualified = end - start
print(f'Elapsed: {elapsed_fully_qualified}')

Elapsed: 2.057656398357829


##### Timing using a directly imported symbol name:

In [44]:
from math import sqrt

start = perf_counter()
for _ in range(test_repeats):
    sqrt(2)
end = perf_counter()
elapsed_direct_symbol = end - start
print(f'Elapsed: {elapsed_direct_symbol}')

Elapsed: 1.603430354697538


Let's see the relative and absolute time differences:

In [45]:
compare_timings(elapsed_fully_qualified, elapsed_direct_symbol)

Timings(timing_1=2.1, timing_2=1.6, abs_diff=-0.45, rel_diff_perc=-22.07)

Definitely faster - but in absolute terms I really did not save a whole lot - over `10,000,000` iterations!
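
As an aside, the standard library's `timeit` module can run the same kind of comparison with less boilerplate. This is only a sketch of an alternative to the hand-written loops used here, not the approach these notes rely on, and the iteration count is an arbitrary choice:

```python
import timeit

repeats = 200_000  # arbitrary; far fewer than the 10,000,000 used in the cells here

# setup runs once; only the statement itself is timed.
qualified = timeit.timeit('math.sqrt(2)', setup='import math', number=repeats)
direct = timeit.timeit('sqrt(2)', setup='from math import sqrt', number=repeats)

print(f'fully qualified: {qualified:.4f}s')
print(f'direct symbol:   {direct:.4f}s')
```

The actual numbers will vary from machine to machine and between Python versions.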

##### Timing using a function (fully qualified symbol)

In [46]:
import math

def func():
    math.sqrt(2)
    
start = perf_counter()
for _ in range(test_repeats):
    func()
end = perf_counter()
elapsed_func_fully_qualified = end - start
print(f'Elapsed: {elapsed_func_fully_qualified}') 

Elapsed: 3.2668947610088703


In [47]:
compare_timings(elapsed_fully_qualified, elapsed_func_fully_qualified)

Timings(timing_1=2.1, timing_2=3.3, abs_diff=1.21, rel_diff_perc=58.77)

That was slower because of the function call overhead, but not by much in absolute terms considering I called `func()` `10,000,000` times!

##### Timing using a function (direct symbol)

In [48]:
from math import sqrt

def func():
    sqrt(2)
    
start = perf_counter()
for _ in range(test_repeats):
    func()
end = perf_counter()
elapsed_func_direct_symbol = end - start
print(f'Elapsed: {elapsed_func_direct_symbol}')

Elapsed: 2.80123663975316


In [49]:
compare_timings(elapsed_func_fully_qualified, elapsed_func_direct_symbol)

Timings(timing_1=3.3, timing_2=2.8, abs_diff=-0.47, rel_diff_perc=-14.25)

Faster, but again not by much in absolute terms considering this was for `10,000,000` iterations.

##### Timing using a nested import (fully qualified symbol)

In [50]:
def func():
    import math
    math.sqrt(2)
    
start = perf_counter()
for _ in range(test_repeats):
    func()
end = perf_counter()
elapsed_nested_fully_qualified = end - start
print(f'Elapsed: {elapsed_nested_fully_qualified}')

Elapsed: 5.041648347331877


In [51]:
compare_timings(elapsed_func_fully_qualified, elapsed_nested_fully_qualified)

Timings(timing_1=3.3, timing_2=5.0, abs_diff=1.77, rel_diff_perc=54.33)

So definitely slower. But in absolute terms, for `10,000,000` iterations?

##### Timing using a nested import (direct symbol)

In [52]:
def func():
    from math import sqrt
    sqrt(2)
    
start = perf_counter()
for _ in range(test_repeats):
    func()
end = perf_counter()
elapsed_nested_direct_symbol = end - start
print(f'Elapsed: {elapsed_nested_direct_symbol}')

Elapsed: 14.60262281403945


In [53]:
compare_timings(elapsed_nested_fully_qualified, elapsed_nested_direct_symbol)

Timings(timing_1=5.0, timing_2=14.6, abs_diff=9.56, rel_diff_perc=189.64)

That was significantly slower! Even in absolute terms this is starting to get sloooow.

So does this mean you should put imports inside functions?

No, of course not - follow the convention, it makes code far more readable, and of course optimize your code only once you have identified the bottlenecks. 

Does this mean you shouldn't care at all about the performance of your code based on the import variants?

Again, of course not - you absolutely should.

But, there is absolutely no reason to re-write your code from 

`import math`
`math.sqrt(2)`

to 

`from math import sqrt`
`sqrt(2)`

for **speed** reasons if during the entire lifetime of your application you only call that function `100` times... or `10,000,000` times.

Really depends on your circumstance - be aware of it, but don't try to optimize code until you know **where** you **need** to optimize!
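
One way to find out **where** the time goes is to profile first. The following is a minimal sketch using the standard library's `cProfile` and `pstats` modules; the functions `cheap`, `expensive` and `main` are hypothetical stand-ins for real application code:

```python
import cProfile
import io
import math
import pstats

def cheap():
    return math.sqrt(2)

def expensive():
    # stand-in for the real bottleneck
    return sum(i * i for i in range(200_000))

def main():
    for _ in range(100):
        cheap()
    expensive()

profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# Collect the report in a string, sorted by cumulative time, top 5 entries.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats('cumulative').print_stats(5)
report = buffer.getvalue()
print(report)
```

In a report like this, `expensive` should dominate, which tells us that tweaking how `math.sqrt` is imported in `cheap` would not be worth the effort.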

*[I've seen people refactor parts of their code for sub-second improvements, when, in fact, the largest bottleneck was that they were opening and closing database connections at every read and write instead of pooling connections or something like that]*

And

`from module import *`

has its uses as we'll see later when we discuss packages.

It's not evil, just not very safe - again depends on your circumstance.