In [None]:
Rule number one:
Only optimize when there is a proven speed bottleneck, and only optimize the innermost loop.
(This rule is independent of Python, but it doesn't hurt to repeat it, since it can save a lot of work. :-)
Small is beautiful. Given Python's hefty charges for bytecode instructions and variable look-up,
it rarely pays off to add extra tests to save a little bit of work.
 
Use intrinsic operations.
An implied loop in map() is faster than an explicit for loop; a while loop with an explicit loop counter is even slower.
Avoid calling functions written in Python in your inner loop. This includes lambdas. 
In-lining the inner loop can save a lot of time.
Local variables are faster than globals; if you use a global constant in a loop, copy it to a local variable
before the loop. And in Python, function names (global or built-in) are also global constants!
Try to use map(), filter() or reduce() to replace an explicit for loop, but only if you can use a built-in 
function: map with a built-in function beats for loop, but a for loop with in-line code beats map 
with a lambda function!
Check your algorithms for quadratic behavior. But notice that a more complex algorithm only pays off 
for large N - for small N, the complexity doesn't pay off. In our case, 256 turned out to be small 
enough that the simpler version was still a tad faster. Your mileage may vary - this is worth investigating.
And last but not least: collect data. Python's excellent profile module can quickly show the
bottleneck in your code. If you're considering different versions of an algorithm, test them in a
tight loop using the timeit module or time.perf_counter() (time.clock() was removed in Python 3.8).
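
A minimal sketch of the "test it in a tight loop" advice, assuming Python 3 and time.perf_counter(). The names loop_version, map_version and best_time are made up for this illustration and the loop counts are arbitrary; the absolute numbers will differ from machine to machine.

```python
import time

def loop_version(codes):
    """Explicit for loop with string concatenation."""
    string = ""
    for item in codes:
        string = string + chr(item)
    return string

def map_version(codes):
    """Implied loop: map() with the built-in chr, joined in one step."""
    return "".join(map(chr, codes))

def best_time(func, arg, repeats=5, loops=1000):
    """Return the best wall-clock time in seconds for `loops` calls of func(arg)."""
    best = float("inf")
    for _attempt in range(repeats):
        start = time.perf_counter()
        for _call in range(loops):
            func(arg)
        best = min(best, time.perf_counter() - start)
    return best

codes = list(range(256))
timings = {f.__name__: best_time(f, codes) for f in (loop_version, map_version)}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f}s")
```

Taking the best of several repeats reduces noise from other processes; the timeit module does the same kind of bookkeeping for you.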
 
Main costs to watch:
function calls
name lookups
loop overhead

In [4]:
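# Optimization steps from f1 to f7 (f7 is the best way of writing the code)
import array
from functools import reduce            # reduce() is not a built-in in Python 3

def f1(codes):                          # plain for loop with string concatenation
    string = ""
    for item in codes:
        string = string + chr(item)
    return string

def f2(codes):                          # reduce() with a lambda: one Python call per item
    return reduce(lambda string, item: string + chr(item), codes, "")

def f3(codes):                          # implied loop: map() with the built-in chr
    string = ""
    for character in map(chr, codes):
        string = string + character
    return string

def f4(codes):                          # local variable lookup is faster
    string = ""
    lchr = chr
    for item in codes:
        string = string + lchr(item)
    return string

def f5(codes):                          # build 16-character chunks, then add the chunks
    string = ""
    for i in range(0, len(codes), 16):  # 0, 16, 32, 48, 64, ...
        s = ""
        for character in map(chr, codes[i:i+16]):
            s = s + character
        string = string + s
    return string

def f6(codes):                          # string.joinfields() is gone; str.join() replaces it
    return "".join(map(chr, codes))

def f7(codes):                          # array loops in C; tostring() was replaced by tobytes()
    return array.array('B', codes).tobytes().decode('latin-1')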

In [None]:
Time taken by for loop: 2.56907987595 > Time taken by list comprehension: 2.01556396484 (less time, but more space)
        
Removing items with a list comprehension needs more memory than a normal loop. A list comprehension
always creates a new list in memory upon completion, so to delete items from a list, a new list
has to be created next to the original.

With a normal for loop, we can use list.remove() or list.pop() to modify the original list instead
of creating a new one in memory. Note that removing items while iterating forward over the same list
skips elements, so iterate backwards by index or over a copy.
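
A small sketch of the two approaches, assuming the goal is to drop odd numbers. The function names are hypothetical and only serve this illustration.

```python
def drop_odds_copy(numbers):
    """Return a new list without the odd numbers (needs memory for the copy)."""
    return [n for n in numbers if n % 2 == 0]

def drop_odds_in_place(numbers):
    """Delete odd numbers from the same list object, walking backwards
    so that deletions do not shift the items still to be visited."""
    for i in range(len(numbers) - 1, -1, -1):
        if numbers[i] % 2:
            del numbers[i]

data = [1, 2, 3, 4, 5, 6]
filtered_copy = drop_odds_copy(data)   # data is untouched, a second list exists
drop_odds_in_place(data)               # data itself shrinks, no second list
print(filtered_copy, data)
```

The in-place version saves memory, but each deletion shifts the tail of the list, so it is usually the slower of the two on large lists.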

In [None]:
A linked list lets you allocate memory as needed. This is possible because the nodes of a linked list can be
stored in different places in memory
and are tied together through pointers. This makes linked lists more flexible than arrays for
growing and for inserting or removing items, at the cost of O(n) access by index.
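
A minimal singly linked list sketch to make the pointer idea concrete. The Node and LinkedList classes are illustrative and not taken from any library.

```python
class Node:
    """One element of the list: a value plus a reference to the next node."""
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node

class LinkedList:
    def __init__(self):
        self.head = None

    def push_front(self, value):
        """O(1): allocate one node and point it at the old head."""
        self.head = Node(value, self.head)

    def __iter__(self):
        """Walk the chain of references from head to tail."""
        node = self.head
        while node is not None:
            yield node.value
            node = node.next

items = LinkedList()
for value in (3, 2, 1):
    items.push_front(value)
as_list = list(items)
print(as_list)
```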

In [None]:
If we need to generate a large number of integers in Python 2, xrange should be our go-to option
since it uses less memory: it produces the integers one at a time. If we use the range function instead, the entire list of
integers has to be created up front, which gets memory intensive. (In Python 3, xrange is gone and range itself is lazy.)

xrange may consume less memory, but it can take more time to find an item in it. Depending on the situation and
the available resources, we can choose either range or xrange.
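
A rough sketch of the memory difference, assuming Python 3, where range is lazy like the old xrange and list(range(...)) stands in for the old range. The variable names are illustrative only.

```python
import sys

N = 1_000_000
lazy = range(N)           # stores only start, stop and step
eager = list(range(N))    # materializes every integer up front

lazy_bytes = sys.getsizeof(lazy)
eager_bytes = sys.getsizeof(eager)   # the list object only, not the ints it refers to
print(lazy_bytes, eager_bytes)
```

sys.getsizeof() reports only the container itself, so the real gap is even larger than the printed numbers suggest.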

In [None]:
This is where Python sets come in. They are like lists, but they are unordered and do not allow duplicates to be stored in them.

Sets are also used to efficiently remove duplicates from lists, and are faster than creating a
new list and populating it item by item from the one with duplicates.
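
A short sketch of removing duplicates. dedupe_with_list is a hypothetical helper showing the slower list-based approach for comparison; dict.fromkeys() is one way to keep the original order, which a set does not preserve.

```python
items = [3, 1, 3, 2, 1]

unique_unordered = set(items)                 # fast, but order is not preserved
unique_ordered = list(dict.fromkeys(items))   # keeps first-seen order (Python 3.7+)

def dedupe_with_list(values):
    """List-only approach: every 'in' test scans the list, so this is O(n^2)."""
    seen = []
    for value in values:
        if value not in seen:
            seen.append(value)
    return seen

print(unique_unordered, unique_ordered, dedupe_with_list(items))
```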

In [None]:
It is evident that the .join() method is not only neater and more readable,
but it is also significantly faster than the concatenation operator when joining strings from an iterable.
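
A small timing sketch for this comparison, assuming Python 3 and the timeit module. The function names and the input size are arbitrary choices for illustration, and the measured gap depends on the interpreter and the data.

```python
import timeit

words = ["spam"] * 1000

def concat_plus(parts):
    """Build the result with repeated += concatenation."""
    result = ""
    for part in parts:
        result += part
    return result

def concat_join(parts):
    """Build the result in one call to str.join()."""
    return "".join(parts)

plus_time = timeit.timeit(lambda: concat_plus(words), number=200)
join_time = timeit.timeit(lambda: concat_join(words), number=200)
print(f"+=: {plus_time:.4f}s  join: {join_time:.4f}s")
```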

In [None]:
pandas DataFrame.apply()
pandas DataFrame.loc
Vectorize your functions in Python
Multiprocessing in Python

In [None]:
Profile and optimize your existing code
Use a C module (or write your own)
Try a JIT-enabled interpreter like Jython or PyPy
Parallelize your workload

In [None]:
CPU cache size, network buffer size, kernel version, 
operating system, dependency versions, and more can all skew your numbers.

In [None]:
#https://people.duke.edu/~ccc14/sta-663/MakingCodeFast.html
Cost in programmer time
Optimized code is often more complex
Optimized code is often less generic

CPU-bound - CPU is working flat out
Memory-bound - Out of RAM - swapping to hard disk
IO-bound - Lots of data transfer to and from hard disk
Network-bound - CPU is waiting for data to come over network or from memory (“starvation”)

Use a better machine (e.g. if RAM is the limiting factor, buy more RAM)
Solve a simpler problem (e.g. will a subsample of the data suffice?)
Solve a different problem (perhaps solving a toy problem will suffice for your JASA paper?
If your method is so useful, maybe someone else will optimize it for you)

Using timeit
Using time
Using cProfile
Using line_profiler
Using memory_profiler

Use better algorithms and data structures
Using compiled code written in another language

Converting Python code to compiled code
Using numexpr
Using numba
Using cython

Parallel programs
Amdahl's and Gustafson's laws
Embarrassingly parallel problems
Problems requiring communication and synchronization
Race conditions
Deadlock
Task granularity
Parallel programming idioms

Execute in parallel
On multi-core machines
On multiple machines
Using IPython
Using MPI4py
Using Hadoop/Spark
On GPUs

In [None]:
#generator
Please keep in mind that you should use a generator only when you don't need to keep all
the generated values at once; if you store them all anyway, you lose the advantage of having a generator.
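
A minimal generator sketch. squares is a hypothetical example function; the point is that values are produced one at a time and are not kept.

```python
def squares(limit):
    """Yield n*n for n in 0..limit-1 without building a list."""
    for n in range(limit):
        yield n * n

total = sum(squares(1000))       # consumes values one by one, constant memory

gen = squares(3)
first_pass = list(gen)           # storing everything gives up the memory advantage
second_pass = list(gen)          # a generator is exhausted after one pass
print(total, first_pass, second_pass)
```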

Use builtin functions and libraries
Use keys for sorts
Optimizing loops
Try multiple coding approaches
Use xrange instead of range (Python 2 only; in Python 3, range() is already lazy)
Use Python multiple assignment to swap variables
Use local variables if possible
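
Two of the tips above, sketched briefly: sorting with a key function and swapping with multiple assignment. The sample data is made up.

```python
words = ["pear", "Fig", "banana"]

plain = sorted(words)                     # uppercase letters sort before lowercase
by_lower = sorted(words, key=str.lower)   # key is computed once per item
by_length = sorted(words, key=len)

a, b = 1, 2
a, b = b, a                               # swap without a temporary variable
print(plain, by_lower, by_length, a, b)
```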

In [None]:
#techbeamers.com/python-code-optimization-tips-tricks/
1. Interning Strings for Efficiency.
2. Peephole Optimization.
3. Profile your Code.
3.1. Use Stopwatch Profiling with <timeit>.
3.2. Use Advanced Profiling with <cProfile>.
4. Use Generators and Keys for Sorting.
5. Optimizing Loops.
5.1. Illustrations for Optimizing a for Loop in Python.
5.2. Let’s Decode What have We Optimized?
6. Use Set Operations.
7. Avoid Using Globals.
8. Use External Libraries/Packages.
9. Use Built-in Operators.
10. Limit Method Lookup in a Loop.
11. Optimizing with Strings.
12. Optimizing with If Statement.

RegEx operations in Python are fast as they are executed in C code. However, in some cases,
basic string methods like <isalpha()/isdigit()/startswith()/endswith()> work better.
Also, you can test different methods using the <timeit> module. It’ll help you determine which
method is truly the fastest.
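
A sketch of how such a comparison could be set up with timeit, assuming the check is "does the string start with an @". with_regex and with_method are illustrative names, and no particular result is implied; run it to see the numbers on your own machine.

```python
import re
import timeit

text = "@example.user"
pattern = re.compile(r"^@")   # compiled once, outside the timed code

def with_regex(s):
    return pattern.match(s) is not None

def with_method(s):
    return s.startswith("@")

regex_time = timeit.timeit(lambda: with_regex(text), number=100_000)
method_time = timeit.timeit(lambda: with_method(text), number=100_000)
print(f"regex: {regex_time:.4f}s  startswith: {method_time:.4f}s")
```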


In [42]:
set(l1) | set(l2)   Union          Set with all l1 and l2 items.
set(l1) & set(l2)   Intersection   Set with common l1 and l2 items.
set(l1) - set(l2)   Difference     Set with l1 items not in l2.
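
The same three operations as runnable code, with made-up sample lists l1 and l2:

```python
l1 = [1, 2, 3, 4]
l2 = [3, 4, 5]

union = set(l1) | set(l2)          # all items from l1 and l2
intersection = set(l1) & set(l2)   # items common to l1 and l2
difference = set(l1) - set(l2)     # items in l1 that are not in l2
print(union, intersection, difference)
```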

In [None]:
1. Python evaluates conditions lazily (short-circuit evaluation), and you can adjust your code to use this. For example,
if you are searching for a fixed pattern in a list, then you can reduce the scope by
adding the following conditions.

Add an ‘AND’ condition which becomes false if the size of the target string is less
than the length of the pattern.
Also, you can first test a fast condition (if any) like “string should start
with an @” or “string should end with a dot”.

2. You can test a condition like <if done is not None>,
which is faster than using <if done != None>.
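
A sketch of both points. find_matches is a hypothetical helper; the cheap tests come first so the more expensive substring search runs only when they pass.

```python
def find_matches(candidates, pattern):
    """Return the candidates that start with '@' and contain pattern."""
    matches = []
    for text in candidates:
        # 'and' stops at the first false condition, cheapest tests first
        if len(text) >= len(pattern) and text.startswith("@") and pattern in text:
            matches.append(text)
    return matches

found = find_matches(["@alice.example", "bob", "@x", "alice.example"], "alice")

done = 0
is_set = done is not None   # identity test, true even for falsy values like 0
print(found, is_set)
```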

In [None]:
#imports should be placed outside the function
#exchanging a Python for loop for a C for loop, as well as removing most of the function calls

In [None]:
#zip, enumerate, dict
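
A short sketch of these three built-ins replacing manual index handling. The sample data is made up.

```python
names = ["ann", "bob", "cid"]
scores = [90, 85, 77]

indexed = list(enumerate(names, start=1))   # (position, item) pairs, no manual counter
paired = list(zip(names, scores))           # walk two lists in step
by_name = dict(zip(names, scores))          # dict lookup is O(1) on average
print(indexed, paired, by_name["bob"])
```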