# Using Pandas within DataSpell

Support for Pandas DataFrames is a core part of DataSpell. Here are a few of the features that make working with Pandas easier.

## Interactive Pandas DataFrames

Within DataSpell, you can interact with Pandas DataFrames directly in the notebook output. This includes:
* displaying the entire DataFrame without trimming;
* changing the column sizes for readability;
* opening DataFrames in new tabs; and
* ordering, copying, and saving DataFrames.
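
Outside the DataSpell table viewer, some of these interactions have rough counterparts in Pandas itself. The sketch below is only an illustration: the `demo` frame and its values are made up, and it relies on `pd.option_context`, `sort_values`, and `to_csv`, not on anything specific to DataSpell.

```python
import io

import pandas as pd

# Hypothetical toy data; the values are made up for illustration.
demo = pd.DataFrame(
    {
        "make": ["B", "A", "C"],
        "city08": [19, 23, 9],
    }
)

# Render every row and column instead of the default truncated repr.
with pd.option_context("display.max_rows", None, "display.max_columns", None):
    full_text = repr(demo)

# Order by a column, similar to sorting on a column header in the viewer.
ordered = demo.sort_values("city08", ascending=False)

# Save as CSV; an in-memory buffer stands in for a file path here.
buffer = io.StringIO()
ordered.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

In a real session you would pass a file path to `to_csv` instead of a buffer.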

In [3]:
import pandas as pd

In [4]:
autos = pd.read_csv('https://github.com/mattharrison/datasets/raw/master/data/vehicles.csv.zip')

In [3]:
autos

| | barrels08 | barrelsA08 | charge120 | charge240 | city08 | city08U | cityA08 | cityA08U | cityCD | cityE | ... | mfrCode | c240Dscr | charge240b | c240bDscr | createdOn | modifiedOn | startStop | phevCity | phevHwy | phevComb |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 15.695714 | 0.0 | 0.0 | 0.0 | 19 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | ... | | | 0.0 | | Tue Jan 01 00:00:00 EST 2013 | Tue Jan 01 00:00:00 EST 2013 | | 0 | 0 | 0 |
| 1 | 29.964545 | 0.0 | 0.0 | 0.0 | 9 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | ... | | | 0.0 | | Tue Jan 01 00:00:00 EST 2013 | Tue Jan 01 00:00:00 EST 2013 | | 0 | 0 | 0 |
| 2 | 12.207778 | 0.0 | 0.0 | 0.0 | 23 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | ... | | | 0.0 | | Tue Jan 01 00:00:00 EST 2013 | Tue Jan 01 00:00:00 EST 2013 | | 0 | 0 | 0 |
| 3 | 29.964545 | 0.0 | 0.0 | 0.0 | 10 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | ... | | | 0.0 | | Tue Jan 01 00:00:00 EST 2013 | Tue Jan 01 00:00:00 EST 2013 | | 0 | 0 | 0 |
| 4 | 17.347895 | 0.0 | 0.0 | 0.0 | 17 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | ... | | | 0.0 | | Tue Jan 01 00:00:00 EST 2013 | Tue Jan 01 00:00:00 EST 2013 | | 0 | 0 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 41139 | 14.982273 | 0.0 | 0.0 | 0.0 | 19 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | ... | | | 0.0 | | Tue Jan 01 00:00:00 EST 2013 | Tue Jan 01 00:00:00 EST 2013 | | 0 | 0 | 0 |
| 41140 | 14.330870 | 0.0 | 0.0 | 0.0 | 20 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | ... | | | 0.0 | | Tue Jan 01 00:00:00 EST 2013 | Tue Jan 01 00:00:00 EST 2013 | | 0 | 0 | 0 |
| 41141 | 15.695714 | 0.0 | 0.0 | 0.0 | 18 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | ... | | | 0.0 | | Tue Jan 01 00:00:00 EST 2013 | Tue Jan 01 00:00:00 EST 2013 | | 0 | 0 | 0 |
| 41142 | 15.695714 | 0.0 | 0.0 | 0.0 | 18 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | ... | | | 0.0 | | Tue Jan 01 00:00:00 EST 2013 | Tue Jan 01 00:00:00 EST 2013 | | 0 | 0 | 0 |


In [10]:
(
    autos[["drive", "fuelType", "fuelCost08"]]
        .groupby(["drive", "fuelType"])
        .mean()
        .rename({"fuelCost08": "meanFuelCost"}, axis=1)
        .reset_index()
)

| | drive | fuelType | meanFuelCost |
|---|---|---|---|
| 0 | 2-Wheel Drive | Diesel | 2197.916667 |
| 1 | 2-Wheel Drive | Electricity | 1307.142857 |
| 2 | 2-Wheel Drive | Premium | 4000.0 |
| 3 | 2-Wheel Drive | Regular | 2752.857143 |
| 4 | 4-Wheel Drive | Diesel | 2061.864407 |
| 5 | 4-Wheel Drive | Electricity | 816.666667 |
| 6 | 4-Wheel Drive | Gasoline or E85 | 2534.782609 |
| 7 | 4-Wheel Drive | Midgrade | 2876.666667 |
| 8 | 4-Wheel Drive | Premium | 2754.340836 |
| 9 | 4-Wheel Drive | Premium and Electricity | 2290.0 |
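
The same kind of grouped mean can also be written with named aggregation, which sets the output column name in one step instead of using `rename` and `reset_index`. This is a sketch on a made-up `toy` frame, not the vehicles data, so the numbers are illustrative only.

```python
import pandas as pd

# Made-up rows standing in for the vehicles data (hypothetical values).
toy = pd.DataFrame(
    {
        "drive": ["4-Wheel Drive", "4-Wheel Drive", "2-Wheel Drive"],
        "fuelType": ["Diesel", "Diesel", "Regular"],
        "fuelCost08": [2000, 2200, 2700],
    }
)

mean_cost = (
    toy
        # as_index=False keeps the group keys as ordinary columns.
        .groupby(["drive", "fuelType"], as_index=False)
        # Named aggregation: output column = (input column, function).
        .agg(meanFuelCost=("fuelCost08", "mean"))
)
```

Either spelling should give the same three columns; which one reads better is largely a matter of taste.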


## Display of Jupyter variables

DataSpell also lets us view the variables we have created in our Jupyter environment, along with metadata such as each variable's `dtype`.
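
For comparison, the same kind of metadata can be read in code. The snippet below is a minimal sketch with a made-up `frame`; it uses the standard `dtypes`, `shape`, and `info()` members of a DataFrame and is not part of DataSpell.

```python
import pandas as pd

# Hypothetical example frame.
frame = pd.DataFrame({"col1": ["a", "b", "c"], "value": [1.5, 2.0, 3.25]})

# Per-column dtypes, returned as a Series indexed by column name.
column_types = frame.dtypes

# shape is a (rows, columns) tuple.
n_rows, n_cols = frame.shape

# info() prints a summary of columns, non-null counts, and dtypes.
frame.info()
```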

## Autocompletion

The autocomplete for Pandas is context-sensitive, meaning you can get appropriate suggestions for paths, column names, and methods.

In [7]:
# Path completion
sample = pd.read_csv("sample_df.csv")

In [9]:
# Column and method completion
(
    sample["col1"]
        .value_counts()
)

a    1
b    1
c    1
Name: col1, dtype: int64
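
To try the completion example yourself, you need a `sample_df.csv` file. The sketch below creates one in a temporary directory; the file contents are an assumption, chosen only to be consistent with the counts shown above.

```python
import tempfile
from pathlib import Path

import pandas as pd

with tempfile.TemporaryDirectory() as tmp_dir:
    # Assumed contents: one column with three distinct values.
    csv_path = Path(tmp_dir) / "sample_df.csv"
    pd.DataFrame({"col1": ["a", "b", "c"]}).to_csv(csv_path, index=False)

    # Read the file back while the temporary directory still exists.
    sample = pd.read_csv(csv_path)

# Count how often each value appears in the column.
counts = sample["col1"].value_counts()
```

The exact formatting of the printed counts can differ between Pandas versions.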

## Inbuilt documentation

DataSpell also has a couple of great features for accessing documentation:
* Introspection
* Documentation on hover
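
The information these features show comes from signatures and docstrings, which you can also reach from code. This is a rough sketch using Python's standard `inspect` module, not a description of how DataSpell works internally.

```python
import inspect

import pandas as pd

# Call signature of a method, similar to what introspection displays.
signature = inspect.signature(pd.Series.value_counts)

# Cleaned-up docstring; help(pd.Series.value_counts) prints the full text.
doc = inspect.getdoc(pd.Series.value_counts) or ""
summary = doc.splitlines()[0] if doc else ""
```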

## Commit diffs

DataSpell also supports viewing the differences between commits of a notebook. Unlike on GitHub, these diffs render the notebook, so you can see the actual changes between versions.
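
Notebook files are JSON documents with a list of cells, which is why a cell-aware diff is easier to read than a line diff of the raw file. The sketch below compares only the code cells of two notebooks using the standard library. It is an illustration under that assumption, not how DataSpell computes its diffs, and the two tiny notebooks are hypothetical.

```python
import difflib
import json


def code_cells(notebook_json: str) -> list[str]:
    """Return the source of each code cell in a notebook given as JSON text."""
    notebook = json.loads(notebook_json)
    sources = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            source = cell.get("source", "")
            # The source may be stored as one string or as a list of lines.
            sources.append(source if isinstance(source, str) else "".join(source))
    return sources


# Two hypothetical, minimal notebooks standing in for two commits.
old_nb = json.dumps({"cells": [{"cell_type": "code", "source": ["autos.head()\n"]}]})
new_nb = json.dumps({"cells": [{"cell_type": "code", "source": ["autos.tail()\n"]}]})

# Diff the code only, ignoring outputs and metadata.
diff = list(
    difflib.unified_diff(
        "\n".join(code_cells(old_nb)).splitlines(),
        "\n".join(code_cells(new_nb)).splitlines(),
        fromfile="old",
        tofile="new",
        lineterm="",
    )
)
```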