# Becoming an Effective Data Science Practitioner 

Jake VanderPlas writes in his *Python Data Science Handbook*: "Being an effective practitioner of data science is less about memorizing the tool or command you should use for every possible situation, and more about learning to effectively find the information you don't know, whether through a web search engine or another means."

This could not be closer to the truth. Throughout my data science journey, I've spent most of my time researching and digging up answers to things I don't know. Searching the web, i.e. Googling, is the first thing most people do when they have a question, whether it is easy or hard to answer. For difficult questions, this is most likely the right move. But an ample amount of information can be found without leaving IPython or a Jupyter Notebook. Some example questions:
- What is in this package I imported? What methods can I call on this object?
- How do I call this function? What arguments and options does it have? 
- What does the source code look like? 


## Documentation using `help()` &  `?` 
Most Python objects carry a docstring, a short summary of the object and how it's used. Python's built-in `help()` function prints this information, and IPython offers `?` as a shorthand for it. Examples below:


In [6]:
help(max)

Help on built-in function max in module builtins:

max(...)
    max(iterable, *[, default=obj, key=func]) -> value
    max(arg1, arg2, *args, *[, key=func]) -> value
    
    With a single iterable argument, return its biggest item. The
    default keyword-only argument specifies an object to return if
    the provided iterable is empty.
    With two or more arguments, return the largest argument.
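
The text that `help()` prints comes from the object's docstring, which plain Python also exposes through the `__doc__` attribute and `inspect.getdoc`. A minimal sketch using only the standard library; the `no_doc` function is a hypothetical example, not part of the original notebook:

```python
import inspect

# The raw docstring is stored on the object itself.
raw_doc = max.__doc__

# inspect.getdoc returns the same text with its indentation cleaned up.
clean_doc = inspect.getdoc(max)
print(clean_doc)

# Hypothetical function without a docstring: its __doc__ is simply None.
def no_doc(x):
    return x

print(no_doc.__doc__)  # prints None
```

This also shows why "most" rather than "all" objects have documentation: a function written without a docstring has nothing for `help()` or `?` to display beyond its signature.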



In [7]:
max?

Docstring:
max(iterable, *[, default=obj, key=func]) -> value
max(arg1, arg2, *args, *[, key=func]) -> value

With a single iterable argument, return its biggest item. The
default keyword-only argument specifies an object to return if
the provided iterable is empty.
With two or more arguments, return the largest argument.
Type:      builtin_function_or_method


`?` works for just about anything, including methods, objects, and functions you define yourself.

In [8]:
numbers = [1,2,3,4]
numbers.insert?

Docstring: L.insert(index, object) -- insert object before index
Type:      builtin_function_or_method


In [9]:
numbers?

Type:        list
String form: [1, 2, 3, 4]
Length:      4
Docstring:
list() -> new empty list
list(iterable) -> new list initialized from iterable's items


In [12]:
def multiply(x, y):
    '''Return x * y'''
    return x * y 

In [13]:
multiply?

Signature: multiply(x, y)
Docstring: Return x * y
File:      ~/Learn.co_DataScience/Personal/Blog_Data_Science/<ipython-input-12-7a2181341da2>
Type:      function


## Source code using `??`

For functions written in Python, `??` gives quick insight into the source code, i.e. what's under the hood. For objects implemented in C, such as many built-ins, the source is not available and `??` shows the same output as `?`. The cells that follow apply it to our simple `multiply` function and to a pandas function.
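
Outside IPython, the standard library's `inspect.getsource` offers a rough equivalent of `??`. The sketch below uses `textwrap.dedent` as a stand-in for any pure-Python function (my choice for illustration, not something from the original notebook) and shows that a C-implemented built-in such as `max` has no retrievable Python source:

```python
import inspect
import textwrap

# Pure-Python function: the source is read from the module's .py file.
source = inspect.getsource(textwrap.dedent)
print(source.splitlines()[0])  # first line of the function definition

# Built-in implemented in C: there is no Python source to retrieve.
try:
    inspect.getsource(max)
    has_source = True
except TypeError:
    has_source = False

print(has_source)  # prints False
```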

In [14]:
multiply??

Signature: multiply(x, y)
Source:
def multiply(x, y):
    '''Return x * y'''
    return x * y
File:      ~/Learn.co_DataScience/Personal/Blog_Data_Science/<ipython-input-12-7a2181341da2>
Type:      function


In [15]:
import pandas as pd 

In [16]:
pd.groupby??

Signature: pd.groupby(*args, **kwargs)
Docstring: <no docstring>
Source:
def groupby(*args, **kwargs):

                  "Please use the Series.groupby() or "
                  "DataFrame.groupby() methods",
    return args[0].groupby(*args[1:], **kwargs)
File:      /anaconda3/envs/learn-env/lib/python3.6/site-packages/pandas/core/api.py
Type:      function


## Tab-Completion 

Another useful IPython feature is tab-completion: pressing the Tab key auto-completes the names of objects, attributes, and modules.
To narrow down the list of suggestions, type the first letter or first few letters of the name before pressing Tab.


In [None]:
numbers.c<TAB>

You can also use tab-completion when importing modules or objects from packages. We'll use the `itertools` module as an example. Tab-completion after a bare `import` also shows which modules are available on your system (this will vary depending on which third-party packages and scripts you have installed).

In [None]:
from itertools import c<TAB>
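
The suggestions that tab-completion offers in the cells above can be approximated in plain Python by filtering `dir()` on a prefix. This is only a sketch of the idea, not how IPython implements completion, and `complete` is a hypothetical helper name:

```python
import itertools

def complete(obj, prefix):
    """Return the public attribute names of obj that start with prefix."""
    return [name for name in dir(obj)
            if name.startswith(prefix) and not name.startswith("_")]

numbers = [1, 2, 3, 4]
print(complete(numbers, "c"))    # ['clear', 'copy', 'count']
print(complete(itertools, "c"))  # candidates for `from itertools import c`
```

Names beginning with an underscore are filtered out here so that the result shows only the public names.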

In [None]:
import <TAB>
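
The list behind a bare `import` completion can be sketched with `pkgutil.iter_modules`, which walks the entries of `sys.path`. The helper name `importable_modules` is hypothetical, and the result depends entirely on the environment the code runs in:

```python
import pkgutil
import sys

def importable_modules(prefix=""):
    """List top-level module names found on sys.path, plus compiled-in modules."""
    names = {module.name for module in pkgutil.iter_modules()}
    names.update(sys.builtin_module_names)
    return sorted(name for name in names if name.startswith(prefix))

print(importable_modules("iter"))
```

Compiled-in modules are added separately because they do not live in a file on `sys.path`.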

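Tab-completion only helps if you know the first characters of a name. To match characters in the middle or at the end of a name, IPython also supports wildcard matching with the `*` character, followed by `?`:

In [37]:
str.*fi*?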

str.find
str.isidentifier
str.rfind
str.zfill
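
The `str.*fi*?` search above can be approximated outside IPython with the standard library's `fnmatch` module. A sketch, assuming shell-style `*` patterns behave like IPython's wildcards for this simple case; `wildcard_search` is a hypothetical helper, and the exact matches depend on the Python version:

```python
import fnmatch

def wildcard_search(obj, pattern):
    """Return the attribute names of obj that match a shell-style pattern."""
    return [name for name in dir(obj) if fnmatch.fnmatchcase(name, pattern)]

matches = wildcard_search(str, "*fi*")
print(matches)  # includes 'find', 'isidentifier', 'rfind' and 'zfill'
```

`fnmatchcase` is used rather than `fnmatch` so that the comparison stays case-sensitive on every platform.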