# EDA

## Analyze and visualize data interactively with D-Tale

Do you still perform your EDA manually?

And repeat the same steps every time?

Let `D-Tale` help you.

`D-Tale` is a powerful Python library that allows you to easily inspect and analyze your data interactively.

You can view your data in a web-based interface with various visualizations.

`D-Tale` works on Pandas data structures, so anything you can load into a DataFrame (CSV, Excel, JSON, SQL, and more) can be explored.

Bonus: You can export the corresponding source code for your steps.

In [None]:
!pip install dtale

In [None]:
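import dtale
import pandas as pd

# A tiny one-row DataFrame, just to have something to explore
df = pd.DataFrame([dict(a=1, b=2, c=3)])

# Start a D-Tale session for the DataFrame
d = dtale.show(df)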

## Use Dark Mode in Matplotlib

For all Dark Mode fans:

You can use Matplotlib in Dark Mode too.

Just activate the built-in `dark_background` style.

In [None]:
import matplotlib.pyplot as plt

# Switch to Matplotlib's built-in dark theme
plt.style.use('dark_background')

fig, ax = plt.subplots()

# Plot on the Axes created above
ax.plot(range(1, 5), range(1, 5))
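
If you only want Dark Mode for a single figure, a minimal sketch with `plt.style.context` could look like this. The title text and the `dark_facecolor` variable are only illustrative names, not part of the original tip.

```python
import matplotlib.pyplot as plt

# Apply the dark theme only inside this block; the global style stays unchanged
with plt.style.context('dark_background'):
    fig, ax = plt.subplots()
    ax.plot(range(1, 5), range(1, 5))
    ax.set_title('Dark Mode, just for this figure')
    # Keep the figure background color around so it can be inspected later
    dark_facecolor = fig.get_facecolor()

print(dark_facecolor)
```

This is handy when one notebook mixes dark and light figures, because nothing has to be reset afterwards.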

## No-Code EDA with `PandasGUI`

`PandasGUI` provides a PyQt application to analyze and interactively plot your Pandas DataFrames, without writing a lot of code.

It offers features such as:

- Filtering
- Summary Statistics
- Different Visualizations like Word Clouds, Bar Charts, etc.

In [None]:
from pandasgui import show
from pandasgui.datasets import pokemon
show(pokemon)
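
For comparison, here is a rough sketch of what the filtering and summary statistics features correspond to in plain Pandas. It assumes the GUI's filter box takes Pandas query expressions, and the small DataFrame is a made-up stand-in, not the bundled `pokemon` dataset.

```python
import pandas as pd

# Hypothetical toy data, only for illustration
df = pd.DataFrame(
    {
        "name": ["A", "B", "C", "D"],
        "kind": ["fire", "water", "fire", "grass"],
        "attack": [52, 48, 84, 49],
    }
)

# Filtering: keep rows that match a query expression
strong_fire = df.query("kind == 'fire' and attack > 50")

# Summary statistics: count, mean, std, min, quartiles, max
summary = df["attack"].describe()

print(strong_fire)
print(summary)
```

With `PandasGUI`, you get the same kind of results by clicking instead of typing.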

## Generate Publication-Ready Plots

Make scientific plots with Python and Matplotlib.

For publication-quality plots, just use `LovelyPlots`.

This package provides a new theme for Matplotlib.

You just have to add one line to upgrade your plots.

In [None]:
!pip install LovelyPlots

In [None]:
# Example from the LovelyPlots repository
import matplotlib.pyplot as plt
import numpy as np

def plot_dist(
    temperatures,
    v,
    mass=85 * 1.66e-27,
    pparam={"xlabel": "Speed", "ylabel": "Speed distribution"},
):

    fig, ax = plt.subplots()
    for T in temperatures:
        fv = MB_speed(v, mass, T)
        ax.plot(v, fv, label=f"T={T}K")
    # Add the legend and axis labels once, after all curves are drawn
    ax.legend()
    ax.set(**pparam)


v = np.arange(0, 800, 10)
temperatures = list(range(100, 500, 75))


def MB_speed(v, m, T):
    """Maxwell-Boltzmann speed distribution for mass m at temperature T."""
    kB = 1.38e-23  # Boltzmann constant in J/K
    return (
        (m / (2 * np.pi * kB * T)) ** 1.5 * 4 * np.pi * v**2 * np.exp(-m * v**2 / (2 * kB * T))
    )

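# Apply the LovelyPlots 'ipynb' style. If Matplotlib cannot find it, try running
# `import lovelyplots` first (newer releases may register their styles on import).
plt.style.use('ipynb')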


plot_dist(temperatures, v)
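
If you want to sanity-check the curves, here is a small sketch that reuses the same `MB_speed` formula. It checks that the distribution integrates to roughly 1 and that it peaks near the most probable speed `sqrt(2 * kB * T / m)`. The finer, wider speed grid `v_fine` and the choice of `T = 100` are assumptions made only for this check.

```python
import numpy as np


def MB_speed(v, m, T):
    """Maxwell-Boltzmann speed distribution (same formula as in the example)."""
    kB = 1.38e-23  # Boltzmann constant in J/K
    return (
        (m / (2 * np.pi * kB * T)) ** 1.5 * 4 * np.pi * v**2 * np.exp(-m * v**2 / (2 * kB * T))
    )


mass = 85 * 1.66e-27  # same default mass as in plot_dist
T = 100  # one of the plotted temperatures, in K
kB = 1.38e-23

# Fine, wide speed grid (an assumption for this check, not the plotting grid)
v_fine = np.linspace(0, 3000, 30001)
fv = MB_speed(v_fine, mass, T)

# Trapezoidal rule: the area under a probability density should be close to 1
area = np.sum((fv[1:] + fv[:-1]) / 2 * np.diff(v_fine))

# The curve should peak near the most probable speed sqrt(2 * kB * T / m)
v_peak_numeric = v_fine[np.argmax(fv)]
v_peak_formula = np.sqrt(2 * kB * T / mass)

print(f"area = {area:.6f}")
print(f"peak: numeric = {v_peak_numeric:.1f}, formula = {v_peak_formula:.1f}")
```

Higher temperatures move the peak to the right and flatten the curve, which is the pattern you should see in the plot.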