<a class="anchor" id="0"></a>
# **15 Data Science hacks to speed up analysis**


Prashant Banerjee

November 2020

Hello friends,


In this notebook, I will share simple Data Science hacks that will make your life easier. They will help you to write your code efficiently and save you lots of time. I hope you will enjoy these hacks. Where appropriate, I have included links to more information about them.

So, let's dive in.

<a class="anchor" id="0.1"></a>
# **Table of Contents**


- [1. Import necessary DS libraries using Pyforest](#1)
- [2. Jupyter Magic Commands](#2)
- [3. rpy2: R and Python in the same Jupyter Notebook](#3)
- [4. Jupyter Widgets](#4)
- [5. Jupyter Themes](#5)
- [6. Sharing Jupyter Notebooks](#6)
- [7. Jupyter Notebook - Keyboard Shortcuts](#7)
- [8. Pretty Display of Variables](#8)
- [9. Quick Access to Documentation](#9)
- [10. Visualize images with IPython](#10)
- [11. Visualize html files, external sites and LaTeX with IPython](#11)
- [12. Python Debugging with Pdb](#12)
- [13. Suppress output of a final function](#13)
- [14. Speed up EDA with Pandas Profiling](#14)
- [15. Cross tabulation with Pandas Crosstab() function](#15)

    
   

# **1. Import necessary DS libraries using Pyforest**<a class="anchor" id="1"></a>

[Table of Contents](#0.1)

- Data Science work generally starts by importing the required Python libraries (assuming that you write your code in Python). Many different libraries like `pandas`, `numpy`, `matplotlib`, `seaborn` or `sklearn` are required to do the work. Importing them takes several lines of code, as follows:

#### **import pandas, matplotlib, seaborn and numpy**

- `import pandas as pd`

- `import numpy as np`

- `import matplotlib.pyplot as plt`

- `import seaborn as sns`


- Now, every time before you can start with the actual work, you have to import your libraries by writing the above lines of code. This is a boring and tedious task, as we write the same lines of code again and again.

- So, what is the solution?


## **The solution is Pyforest**

###### Pyforest offers the following:

- We can use all our libraries like we usually do. If a library is not imported yet, pyforest will import it first and add the import statement to the first Jupyter cell.

- Only those libraries are imported which are actually required.

- We don't have to waste time on imports.


###### But, first of all we have to install pyforest 

## **Pyforest Installation.**


- Install pyforest with pip (from a terminal, or from a notebook cell as below):

In [None]:
pip install pyforest

- The **Pyforest** library reduces the imports above to a single line of code. We can now make most of the common Data Science libraries available with one Pyforest import.

- The single line of code that does this is as follows:

In [None]:
from pyforest import *

## **Using Pyforest**

- Now, we can use our favorite Python Data Science commands without importing the libraries ourselves.

- For example, if we want to read a CSV file with pandas, we can do it as follows:

In [None]:
df = pd.read_csv("../input/titanic/train_and_test2.csv")

In [None]:
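# distplot is deprecated in recent seaborn releases; histplot with kde=True draws a similar plot
sns.histplot(df.Age, kde=True)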

Let's check the libraries that are imported.

In [None]:
active_imports()

- We can see that we have used pandas and seaborn without actually importing them. That's how we use **Pyforest.**

- Source : https://github.com/8080labs/pyforest

# **2. Jupyter Magic Commands**<a class="anchor" id="2"></a>

[Table of Contents](#0.1)


- I am sure most of us know about **Jupyter Notebooks** and **Jupyter Magic Commands**, so this section is a bit of a recap. If you have not come across them before, you will find this section important, and it will increase your productivity.

- Jupyter notebook was formerly known as **IPython notebook**. It is a flexible tool that helps to create powerful machine learning analyses. It is based on the IPython kernel, so it has access to all the magics from the IPython kernel.

- We can list all the magic commands as follows:-

In [None]:
# the following command will list all magic commands

%lsmagic

- You can check the documentation for [Built-in magic commands](https://ipython.readthedocs.io/en/stable/interactive/magics.html) for complete reference.

- Below, we will discuss several important magic commands. 

- But first let's discuss the types of magic commands.

## **2.1 Types of Magic Commands**<a class="anchor" id="2.1"></a>


There are two types of magic commands:

1. Line magics
2. Cell magics

### **2.1.1 Line Magics**<a class="anchor" id="2.1.1"></a>

- They are similar to command line calls. They start with the **% character**.

- The **% prefix** means that the command operates over a single line of code.

- The rest of the line is its argument, passed without parentheses or quotes.

- Line magics can be used as expressions, and their return value can be assigned to a variable.

### **2.1.2 Cell Magics**<a class="anchor" id="2.1.2"></a>

- They have the **%% character** prefix. Unlike line magic functions, they can operate on multiple lines below their call.

- The **%% prefix** allows the command to operate over an entire cell.

- They can in fact make arbitrary modifications to the input they receive, which need not even be valid Python code. They receive the whole block as a single string.

- To learn more about a magic function and its docstring, we can suffix the magic command with a **?**.
- Information about a specific magic function is obtained with the **%magicfunction?** command.
- For example, we can obtain information about the magic command **%time** as follows:

In [None]:
%time?

## **2.2 Commonly used Magic Commands**<a class="anchor" id="2.2"></a>


- Now, I will discuss two of the most commonly used magic commands as given below:-

### **2.2.1 %matplotlib inline and %matplotlib notebook** <a class="anchor" id="2.2.1"></a>


### **%matplotlib inline**

- **%matplotlib inline** is probably the most popular magic command. It sets the backend of matplotlib to the `inline` backend. With this backend, the output of plotting commands is displayed inline within frontends like the Jupyter notebook, directly below the code cell that produced it. The resulting plots are also stored in the notebook document.

- **%matplotlib inline** draws only static images in the notebook.

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
plt.plot([1,1.6,3])

- It is safest to set the backend before importing pyplot in Jupyter.
- If the backend is changed later in a session, pyplot may need to be imported again for the change to take effect.
- Therefore, call **%matplotlib ...** prior to importing pyplot.

### **%matplotlib notebook**


- **%matplotlib notebook** will lead to interactive plots embedded within the notebook. 
- We can zoom and resize the figure as shown below.

In [None]:
%matplotlib notebook
%matplotlib notebook
#calling it a second time may prevent some graphics errors
import matplotlib.pyplot as plt
plt.plot([1,1.6,3])

### **Notes on sizes**

- The `notebook` and `ipympl` backends create resizable plots (in principle; resizing does not always work with `ipympl`).

- When using %matplotlib inline or mpld3, adjust the figure size with the code below:

- `plt.rcParams['figure.figsize'] = [9.5, 6]`

- The parameters are given as width, then height, in inches.

- Please see the Matplotlib documentation on backends for more information.
- Source : 
- https://matplotlib.org/tutorials/introductory/usage.html#backends
- https://stackoverflow.com/questions/3285193/how-to-change-backends-in-matplotlib-python
- https://stackoverflow.com/questions/43545050/using-matplotlib-notebook-after-matplotlib-inline-in-jupyter-notebook-doesnt
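
- As a small sketch of the figure-size setting described above: `rcParams` changes the default for every later figure, while the `figsize` argument overrides it for one figure only. The `Agg` backend is an assumption made so that the snippet also runs outside a notebook; inside a notebook that line can be dropped.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend, drop this line in a notebook
import matplotlib.pyplot as plt

# Global default: every figure created afterwards is 9.5 in wide and 6 in tall.
plt.rcParams["figure.figsize"] = [9.5, 6]
fig_default, ax_default = plt.subplots()
ax_default.plot([1, 1.6, 3])

# Per-figure override: only this figure is 4 in wide and 3 in tall.
fig_small, ax_small = plt.subplots(figsize=(4, 3))
ax_small.plot([1, 1.6, 3])

# get_size_inches() returns (width, height); convert to plain floats for printing.
default_size = tuple(float(v) for v in fig_default.get_size_inches())
small_size = tuple(float(v) for v in fig_small.get_size_inches())
print(default_size, small_size)
```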

### **2.2.2 %time, %%time and %timeit** <a class="anchor" id="2.2.2"></a>


- The above magic commands tell us how much time our code needs to run.

- **%%time** is a very useful cell magic command; **%time** is its line version, which times a single statement.

- It is a fast way to benchmark our code and to know how much time is required by the code to run.

- **%%time** will give us information about a single run of the code in the cell as follows:-

In [None]:
%%time
import time
for _ in range(1000):
    time.sleep(0.01)  # sleep for 0.01 seconds


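- **%timeit** (or its cell version **%%timeit**) uses the Python [timeit module](https://docs.python.org/3.5/library/timeit.html) to run a statement many times. By default, IPython chooses the number of loops automatically so that each run takes long enough to measure, repeats the measurement 7 times, and reports the mean and standard deviation of the time per loop. The number of loops and repeats can be set with the `-n` and `-r` options.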

In [None]:
import numpy

%timeit numpy.random.normal(size=100)
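
- To see what a repeated timing measurement involves, here is a minimal sketch that calls the standard `timeit` module directly. The statement, the loop count and the number of runs are illustrative choices, and the summary line only imitates the style of a timing report; it is not IPython's own code.

```python
import statistics
import timeit

# Statement to benchmark: build a list of 100 squares.
statement = "[i * i for i in range(100)]"

# Execute the statement 10,000 times per measurement and take 7 measurements.
loops = 10_000
runs = 7
totals = timeit.repeat(stmt=statement, number=loops, repeat=runs)

# Convert each total into a per-loop time, then summarise across the runs.
per_loop = [total / loops for total in totals]
mean_us = statistics.mean(per_loop) * 1e6
stdev_us = statistics.stdev(per_loop) * 1e6
print(f"{mean_us:.2f} us +- {stdev_us:.2f} us per loop "
      f"(mean +- std. dev. of {runs} runs, {loops} loops each)")
```

- Averaging over many loops smooths out timer resolution, and repeating the whole measurement shows how much the result varies between runs.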

### **Other useful magic commands**

- Below is a table of other useful Jupyter notebook magic commands.

| Magic command | Result |
|---|---|
| **%pwd** | Print the current working directory |
| **%cd** | Change the current working directory |
| **%ls** | List the contents of the current directory |
| **%history** | Show the history of the In [ ]: commands |

# **3. rpy2: R and Python in the same notebook** <a class="anchor" id="3"></a>

[Table of Contents](#0.1)


- We all know about Python and R. They are among the most popular open source programming languages used by data scientists all over the world.

- R is primarily used for statistical data analysis, while Python is often the first choice of first-time programmers because its code is easy to read and write.

- We can use both of them in a single Jupyter Notebook. 

- We just need to install [rpy2](https://pypi.org/project/rpy2/). It can be easily done with the pip command as follows:-
  - `pip install rpy2`
  

- Or, we can use the following command:-

  - `conda install rpy2`



- We can then use the two languages together, and even pass variables between them.

- See the [rpy2 documentation](https://rpy2.github.io/doc/latest/html/index.html) for more information.

# **4. Jupyter Widgets**<a class="anchor" id="4"></a>

[Table of Contents](#0.1)

- The Jupyter Notebook has a feature known as **widgets**. In the Jupyter Notebook, we can create sliders, buttons, text boxes and much more. So, they are basically the controls that make up the user interface. 

- We can see some pre-made widgets by going to the following url:

  http://jupyter.org/widgets
  
  
- Formally, a widget can be defined as follows:-

  - *A widget is used to create an interactive graphical user interface for your user. The widgets synchronize stateful and stateless information between Python and Javascript.*

- So, a widget is an `eventful python object` that in the case of Jupyter Notebook, resides in the browser and is a user interface element, such as a slider or textbox. 

- Jupyter supports a fairly wide array of widgets including the following:
  - Numeric
  - Boolean
  - Selection
  - String
  - Image
  - Button
  - Output
  - Animation
  - Date picker
  - Color picker
  - Controller (i.e. game controller)
  - Layout
  
- For a full list on widgets, we can check out the [Widget List](https://ipywidgets.readthedocs.io/en/stable/examples/Widget%20List.html) documentation.

- We can run the following code in our notebook to see the complete list of widgets.

In [None]:
import ipywidgets as widgets
print(dir(widgets))

## **4.1 Creation of Widgets**<a class="anchor" id="4.1"></a>

### **4.1.1 Slider Widget**<a class="anchor" id="4.1.1"></a>


- There are a number of methods for creating widgets in Jupyter Notebook. 

- The first and easiest method is by using the interact function from **ipywidgets.interact** which will automatically generate user interface controls (or widgets) that we can then use to explore our code and interact with data.

- We can create a simple slider with the following lines of code.

In [None]:
from ipywidgets import interact
print(type(interact))

In [None]:
def my_function(x):
    return x
# create a slider
interact(my_function, x=20)

- In the above code block, we import `interact` from ipywidgets. Then we create a simple function called my_function that accepts a single argument and returns it. Finally we call `interact`, passing it the function along with the value that we want `interact` to pass to it. Since we passed in an integer (i.e. 20), `interact` automatically creates a slider.

- We can move the slider around with our mouse. We can see that the slider updates interactively and the output from the function is also automatically updated.

### **4.1.2 Checkboxes**<a class="anchor" id="4.1.2"></a>


- We can add a new cell in the Jupyter Notebook with the following code:

In [None]:
interact(my_function, x=True)

- We can play around with this widget as well by just checking and un-checking the checkbox. We will see its state change and the output from the function call will also get printed on-screen.

### **4.1.3 Textboxes**<a class="anchor" id="4.1.3"></a>

- We can create textboxes with the following lines of code.

In [None]:
interact(my_function, x='Jupyter Notebook!')

- When we run this code, we can find that interact generates a textbox with the string we passed in as its value.

## **4.2 Closing of Widgets**<a class="anchor" id="4.2"></a>

- We can close a widget by calling its `close()` method. If we want to remove the widget, just clear the cell.

- Please see the following documentation for more information on jupyter widgets.

  - https://www.blog.pythonlibrary.org/2018/10/24/working-with-jupyter-notebook-widgets/
  
  - https://www.blog.pythonlibrary.org/2018/10/23/creating-jupyter-notebook-widgets-with-interact/

# **5. Jupyter Themes**<a class="anchor" id="5"></a>

[Table of Contents](#0.1)


- Jupyter themes are a great way to beautify our notebooks and get a dark mode, which is popular among programmers.

- Themes are a combination of background colour, style, text etc. So, they can be used to change not only the background colour but also the style of the text. Besides, we can also customise the text in markdown, the font size of pandas DataFrames, cell width and height, cursor colour, visibility of the toolbar and more.

- Furthermore, we can set the plotting style with `jtplot.style()` to beautify our visualisations generated with matplotlib. One can also mould figure properties like grid, spines and more.



- First of all, we need to install [jupyterthemes](https://github.com/dunovank/jupyter-themes) with pip:

In [None]:
pip install jupyterthemes

- The available themes are given below:-

  - onedork
  - grade3
  - oceans16
  - chesterish
  - monokai
  - solarizedl
  - solarizedd
  
- Screenshots of the available themes are also available in the [Github repository](https://github.com/dunovank/jupyter-themes).

In [None]:
# available themes can be shown with the following command
!jt -l

- Now, we can set the theme with the following command:

  - `jt -t <theme-name>`
  
- I am not running this command, as it will change the theme.

- We can restore the default theme with the following command

   - `jt -r`

# **6. Sharing Jupyter Notebooks**<a class="anchor" id="6"></a>

[Table of Contents](#0.1)


- The easiest way to share the notebook is simply using the notebook file (.ipynb). There are other options also available.

- Convert notebooks to html files using the File > Download as > HTML Menu option.
- Upload your .ipynb file to [Google Colab](https://colab.research.google.com/notebooks/intro.ipynb).
- Share the notebook file with [gists](https://gist.github.com/) or on github, both of which render the notebooks. 
- Store the notebook e.g. in dropbox and put the link to [nbviewer](https://nbviewer.jupyter.org/). nbviewer will render the notebook from whichever source you host it.
- Use the File > Download as > PDF menu to save your notebook as a PDF.

# **7. Jupyter Notebook - Keyboard Shortcuts**<a class="anchor" id="7"></a>

[Table of Contents](#0.1)


- Keyboard shortcuts will save us lots of time. 
- Jupyter lists its keyboard shortcuts under the menu at the top: **Help > Keyboard Shortcuts**. The same list opens by pressing **H** in command mode.
- Another way to access keyboard shortcuts is to use the command palette: `Cmd + Shift + P (or Ctrl + Shift + P on Linux and Windows)`. 
- This dialog box lets us run any command by name. It is useful if we don’t know the keyboard shortcut for an action, or if the action we want does not have a keyboard shortcut.

### **Some of the commonly used shortcuts are given below:**

- `Esc` will take us into the command mode where we can navigate around the notebook with arrow keys.
- While in command mode:
  - `A` to insert a new cell above the current cell, `B` to insert a new cell below.
  - `M` to change the current cell to Markdown, `Y` to change it back to code.
  - `D + D` (press the key twice) to delete the current cell.
- `Enter` will take you from command mode back into edit mode for the given cell.
- `Shift + Tab` will show you the Docstring (documentation) for the object you have just typed in a code cell – you can keep pressing this shortcut to cycle through a few modes of documentation.
- `Ctrl + Shift + -` will split the current cell into two from where your cursor is.
- `Esc + F` Find and replace on your code but not the outputs.
- `Esc + O` Toggle cell output.
- Select Multiple Cells:
  - `Shift + J` or `Shift + Down` selects the next cell in a downwards direction. We can also select cells in an upwards direction by using `Shift + K` or `Shift + Up`.
  - Once cells are selected, we can then delete / copy / cut / paste / run them as a batch. This is helpful when you need to move parts of a notebook.
  - We can also use `Shift + M` to merge multiple cells.

- Please see the [Jupyter Notebook Cheat Sheet](https://www.edureka.co/blog/wp-content/uploads/2018/10/Jupyter_Notebook_CheatSheet_Edureka.pdf) for more information.

- You can also see this [Cheatography](https://cheatography.com/weidadeyue/cheat-sheets/jupyter-notebook/pdf_bw/) on the shortcuts.

- For a full list of keyboard shortcuts, click the help button, then the keyboard shortcuts button.

# **8. Pretty Display of Variables**<a class="anchor" id="8"></a>

[Table of Contents](#0.1)


- When a Jupyter cell ends with the name of a variable or the unassigned output of a statement, Jupyter displays that value without the need for a print statement. This is especially useful when dealing with Pandas DataFrames, as the output is neatly formatted into a table.

In [None]:
pip install pydataset

In [None]:
from pydataset import data
quakes = data('quakes')
quakes.head()
quakes.tail()

- We can see that only the result of the last statement is displayed, although both statements ran. We can change this behaviour of Jupyter notebooks.

- We can modify the `ast_node_interactivity` option of the interactive shell to make Jupyter display the value of any variable or statement on its own line. So, we can see the value of multiple statements at once.

In [None]:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

In [None]:
from pydataset import data
quakes = data('quakes')
quakes.head()
quakes.tail()

- We can see that the output of both statements is now displayed.

# **9. Quick Access to Documentation**<a class="anchor" id="9"></a>

[Table of Contents](#0.1)


- We could search the online documentation of common libraries such as NumPy, Pandas, SciPy and Matplotlib, but that is time-consuming.

- Luckily, we have a shortcut for that. By prepending a library, method or variable with ?, we can access the docstring for help or quick reference on syntax.

- For example, we can access the documentation of numpy as follows:-

In [None]:
?numpy

# **10. Visualize images with IPython**<a class="anchor" id="10"></a>

[Table of Contents](#0.1)


- In Python, objects can declare their textual representation using the `__repr__` method. 
- IPython expands on this idea and allows objects to declare other, richer representations including:

  - HTML
  - JSON
  - PNG
  - JPEG
  - SVG
  - LaTeX
  
- A single object can declare some or all of these representations; all are handled by IPython's display system. 
- In this section, we will discuss how you can use this display system to incorporate a broad range of content into your Notebooks.


## **10.1 Basic display imports**<a class="anchor" id="10.1"></a>

- The `display` function is a general purpose tool for displaying different representations of objects. 

- We can think of it as print for these rich representations.

In [None]:
from IPython.display import display

- Important points:

  - Calling display on an object will send all possible representations to the Notebook.
  - These representations are stored in the Notebook document.
  - In general the Notebook will use the richest available representation.
  
- If we want to display a particular representation, there are specific functions for that:

In [None]:
from IPython.display import display_pretty, display_html, display_jpeg, display_png,\
                            display_json, display_latex, display_svg

## **10.2 Images**<a class="anchor" id="10.2"></a>


- To work with images (JPEG, PNG), we can use the Image class as follows:-

In [None]:
from IPython.display import Image

In [None]:
Image(url='http://python.org/images/python-logo.gif')

- For more information on visualizing images with IPython, please visit the following links:


- [IPython's Rich Display System](https://nbviewer.jupyter.org/github/ipython/ipython/blob/1.x/examples/notebooks/Part%205%20-%20Rich%20Display%20System.ipynb)

- [IPython API - Module : display](https://ipython.readthedocs.io/en/stable/api/generated/IPython.display.html)

- [How to display objects as images in IPython](https://www.tjelvarolsson.com/blog/how-to-display-objects-as-images-in-ipython/)

# **11. Visualize html files, external sites and LaTeX with IPython** <a class="anchor" id="11"></a>

[Table of Contents](#0.1)

## **11.1 HTML files** <a class="anchor" id="11.1"></a>


- Python objects can declare HTML representations that will be displayed in the Notebook. If we have some HTML that we want to display, we can simply use the HTML class.

In [None]:
from IPython.display import HTML

In [None]:
s = """<table>
<tr>
<th>Header 1</th>
<th>Header 2</th>
</tr>
<tr>
<td>row 1, cell 1</td>
<td>row 1, cell 2</td>
</tr>
<tr>
<td>row 2, cell 1</td>
<td>row 2, cell 2</td>
</tr>
</table>"""

In [None]:
h = HTML(s); 

h

- Pandas makes use of this capability to allow DataFrames to be represented as HTML tables.
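
- The hook that makes this possible is available to any class: IPython looks for a `_repr_html_` method on an object and renders the string it returns. Below is a minimal sketch; the `ScoreTable` class and its data are hypothetical and exist only for illustration.

```python
from html import escape


class ScoreTable:
    """Hypothetical example class with a plain and an HTML representation."""

    def __init__(self, scores):
        self.scores = dict(scores)

    def __repr__(self):
        # Used by print() and by plain-text consoles.
        return f"ScoreTable({self.scores!r})"

    def _repr_html_(self):
        # IPython calls this method and, in a notebook, renders the returned HTML.
        rows = "".join(
            f"<tr><td>{escape(str(name))}</td><td>{score}</td></tr>"
            for name, score in self.scores.items()
        )
        return f"<table><tr><th>Name</th><th>Score</th></tr>{rows}</table>"


table = ScoreTable({"Ada": 92, "Grace": 88})
print(repr(table))
print(table._repr_html_())
```

- In a notebook, ending a cell with `table` would show the rendered HTML table, while a plain Python console falls back to the `__repr__` text.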

## **11.2 External sites** <a class="anchor" id="11.2"></a>


- We can even embed an entire page from another site in an iframe.

- For example, this is the Wikipedia article about Wikipedia.


In [None]:
from IPython.display import HTML
HTML('<iframe src="https://en.wikipedia.org/wiki/Wikipedia" width="700" height="350"></iframe>')

## **11.3 LaTeX** <a class="anchor" id="11.3"></a>


- Mathematical expressions typeset in [LaTeX](https://www.latex-project.org/about/) can also be displayed. 
- This is possible thanks to the [MathJax](https://www.mathjax.org/) library.

In [None]:
from IPython.display import Math
Math(r'F(k) = \int_{-\infty}^{\infty} f(x) e^{2\pi i k x} dx')

- With the `Latex` class, we have to include the delimiters ourselves. This allows us to use other LaTeX modes such as eqnarray:

In [None]:
from IPython.display import Latex
Latex(r"""\begin{eqnarray}
\nabla \times \vec{\mathbf{B}} -\, \frac1c\, \frac{\partial\vec{\mathbf{E}}}{\partial t} & = \frac{4\pi}{c}\vec{\mathbf{j}} \\
\nabla \cdot \vec{\mathbf{E}} & = 4 \pi \rho \\
\nabla \times \vec{\mathbf{E}}\, +\, \frac1c\, \frac{\partial\vec{\mathbf{B}}}{\partial t} & = \vec{\mathbf{0}} \\
\nabla \cdot \vec{\mathbf{B}} & = 0 
\end{eqnarray}""")

- Or we can enter latex directly with the %%latex cell magic:

In [None]:
%%latex
\begin{aligned}
\nabla \times \vec{\mathbf{B}} -\, \frac1c\, \frac{\partial\vec{\mathbf{E}}}{\partial t} & = \frac{4\pi}{c}\vec{\mathbf{j}} \\
\nabla \cdot \vec{\mathbf{E}} & = 4 \pi \rho \\
\nabla \times \vec{\mathbf{E}}\, +\, \frac1c\, \frac{\partial\vec{\mathbf{B}}}{\partial t} & = \vec{\mathbf{0}} \\
\nabla \cdot \vec{\mathbf{B}} & = 0
\end{aligned}

- Please see the [notebook](https://nbviewer.jupyter.org/github/ipython/ipython/blob/1.x/examples/notebooks/Part%205%20-%20Rich%20Display%20System.ipynb) for more information.

# **12. Python Debugging with pdb**<a class="anchor" id="12"></a>

[Table of Contents](#0.1)


- Python's standard library contains the [pdb module](https://docs.python.org/3.5/library/pdb.html), which is a set of utilities for debugging Python programs. The documentation includes a [list of accepted commands](https://docs.python.org/3.5/library/pdb.html#debugger-commands) for pdb.

- The debugging functionality is defined in the `Pdb` class. The module internally makes use of the `bdb` and `cmd` modules.

- The pdb module has a very convenient command line interface. It can be invoked on a Python script with the `-m` switch, for example `python -m pdb myscript.py` (where `myscript.py` stands for your own script).

- For more information on Python Debugging with pdb, see the official documentation - [pdb-The Python Debugger](https://docs.python.org/3.5/library/pdb.html)
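
- Besides the command line interface, the debugger can be started from inside the code with `pdb.set_trace()`. The sketch below uses a hypothetical `average` function with a `debug` flag, so that it runs normally unless the debugger is requested explicitly.

```python
import pdb


def average(values, debug=False):
    """Return the arithmetic mean of values."""
    total = sum(values)
    if debug:
        # Execution pauses here and opens the (Pdb) prompt.
        # Useful commands: p total (print), n (next line), c (continue), q (quit).
        pdb.set_trace()
    return total / len(values)


# Normal run: the debugger is not involved.
print(average([2, 4, 9]))

# Uncomment to step through the function interactively:
# average([2, 4, 9], debug=True)
```

- In a notebook, the `%debug` magic offers a related shortcut: run it right after a cell raises an exception to inspect the failed call at the same `(Pdb)`-style prompt.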

# **13. Suppress output of a final function**<a class="anchor" id="13"></a>

[Table of Contents](#0.1)


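- Sometimes, we want to suppress the output of the function on the final line of a cell.
- This is useful when plotting, where the return value of the plotting call (here, the arrays and patches returned by `plt.hist`) is otherwise printed above the figure.
- To do this, we just need to add a semicolon at the end.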

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 1, 1000)**1.5
plt.hist(x)

In [None]:
# By adding a semicolon at the end, the output is suppressed
plt.hist(x);

- We can see that, by adding a semicolon at the end, the text output is suppressed while the plot is still drawn.

# **14. Speed up EDA with Pandas Profiling**<a class="anchor" id="14"></a>

[Table of Contents](#0.1)

- `Pandas profiling` is an open source Python module with which we can quickly do an exploratory data analysis with just a few lines of code. (Newer releases of the package are published under the name `ydata-profiling`.)
- In short, pandas profiling saves us all the work of visualizing and understanding the distribution of each variable.
- For more information on `pandas profiling`, please see my other kernel - [EDA is fun](https://www.kaggle.com/prashant111/eda-is-fun)

In [None]:
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# Any results you write to the current directory are saved as output.

In [None]:
df = pd.read_csv('/kaggle/input/titanic/train_and_test2.csv')


In [None]:
import pandas_profiling as pp

pp.ProfileReport(df)

- We can see that `Pandas Profiling` is really helpful in generating a detailed exploratory analysis report.

# **15. Cross-tabulation with Pandas crosstab() function**<a class="anchor" id="15"></a>

[Table of Contents](#0.1)

- The `Pandas crosstab()` function is used to compute a simple cross tabulation of two (or more) factors. By default, it computes a frequency table of the factors unless an array of values and an aggregation function are passed.

- For more information, see the official documentation [Pandas crosstab](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.crosstab.html)

In [None]:
# importing packages 
import pandas as pd
import numpy as np
  
# creating some data 
a = np.array(["foo", "foo", "foo", "foo", 
                 "bar", "bar", "bar", "bar", 
                 "foo", "foo", "foo"], 
                dtype=object) 
  
b = np.array(["one", "one", "one", "two", 
                 "one", "one", "one", "two", 
                 "two", "two", "one"], 
                dtype=object) 
  
c = np.array(["dull", "dull", "shiny", 
                 "dull", "dull", "shiny", 
                 "shiny", "dull", "shiny", 
                 "shiny", "shiny"], 
                dtype=object) 
  
# form the cross tab 
pd.crosstab(a, [b, c], rownames=['a'], colnames=['b', 'c']) 

- We can see that crosstab() has created a nice frequency table of the factors.
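
- The description above also mentions passing an array of values together with an aggregation function. Here is a small sketch of that variant, next to the default count and a row-normalised table; the `orders` data is made up for illustration.

```python
import pandas as pd

# Small hypothetical sales data: one row per order.
orders = pd.DataFrame({
    "region": ["north", "north", "south", "south", "south", "north"],
    "product": ["pen", "ink", "pen", "pen", "ink", "pen"],
    "amount": [10, 30, 20, 40, 50, 20],
})

# Default behaviour: count how often each region/product combination occurs.
counts = pd.crosstab(orders["region"], orders["product"])

# With values and aggfunc: total amount per combination instead of a count.
totals = pd.crosstab(orders["region"], orders["product"],
                     values=orders["amount"], aggfunc="sum")

# normalize="index" turns each row of counts into proportions that sum to 1.
shares = pd.crosstab(orders["region"], orders["product"], normalize="index")

print(counts, totals, shares, sep="\n\n")
```

- Here `counts` answers "how many orders?", `totals` answers "how much was sold?", and `shares` shows the product mix within each region.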

# **16. References**<a class="anchor" id="16"></a>

[Table of Contents](#0.1)


The ideas and concepts in this notebook have been taken from the following websites:

- https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/

- https://www.analyticsvidhya.com/blog/2019/08/10-powerful-python-tricks-data-science/

- https://courses.analyticsvidhya.com/courses/data-science-hacks-tips-and-tricks?utm_source=hacksandtipsbanner&utm_medium=blog

- https://www.marktechpost.com/2019/09/19/pyforest-importing-python-data-science-libraries-in-one-line-of-code/



- Thus, we have come to the end of this notebook. I hope you find this notebook useful and enjoyable.

- If you have more tips and tricks up your sleeve, then please share them with the community.

- Your comments and feedback are most welcome.

- Thank you


[Go to Top](#0)