A pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both row and column labels. It can be thought of as a dict-like container for Series objects, and it is the primary data structure in pandas.

Pandas DataFrame.to_sparse
The DataFrame.to_sparse() function converts a DataFrame to a SparseDataFrame. It is available only in pandas versions before 1.0: the method was deprecated in 0.25 and removed, together with SparseDataFrame, in 1.0. A sparse representation omits every value that matches a chosen fill value, which allows more efficient storage when most of the data equals that value.

Sparse objects are compressed by omitting any data that matches a specific value (NaN/missing by default, though any value can be chosen). A special SparseIndex object tracks where data has been sparsified. This will make much more sense in an example.
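As a small illustration of the idea, the sketch below builds a `pd.arrays.SparseArray` directly and inspects what is actually stored. The sample values are made up for illustration; only the non-missing entries and their positions are kept, and NaN serves as the omitted fill value.

```python
import numpy as np
import pandas as pd

# Mostly-missing data: only two of the six values are actually present.
arr = pd.arrays.SparseArray([1.0, np.nan, np.nan, np.nan, 5.0, np.nan])

print(arr.fill_value)  # the value that is omitted from storage
print(arr.sp_values)   # the values that are physically stored
print(arr.sp_index)    # the index object recording where they sit
print(arr.npoints)     # number of stored values
print(arr.density)     # fraction of entries that are stored
```

The lower the density, the more there is to gain from a sparse representation.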

**Pandas SparseDataFrame Example**
Example 1: Use the DataFrame.to_sparse() function to convert the given DataFrame to a SparseDataFrame for efficient storage.

In [1]:
# importing pandas as pd
import pandas as pd

# Creating the DataFrame
df = pd.DataFrame({'Weight': [45, 88, 56, 15, 71],
                   'Name': ['Sam', 'Andrea', 'Alex', 'Robin', 'Kia'],
                   'Age': [14, 25, 55, 8, 21]})

# Create the index
index_ = pd.date_range('2010-10-09 08:45', periods=5, freq='H')

# Set the index
df.index = index_

# Print the DataFrame
print(df)


                     Weight    Name  Age
2010-10-09 08:45:00      45     Sam   14
2010-10-09 09:45:00      88  Andrea   25
2010-10-09 10:45:00      56    Alex   55
2010-10-09 11:45:00      15   Robin    8
2010-10-09 12:45:00      71     Kia   21


Now we will use the DataFrame.to_sparse() function to convert the given DataFrame to a SparseDataFrame.

In [2]:
# convert to SparseDataFrame
result = df.to_sparse()

# Print the result
print(result)

# Verify the result by checking the
# type of the object.
print(type(result))


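AttributeError: 'DataFrame' object has no attribute 'to_sparse'

The call raises AttributeError on pandas 1.0 and later, where to_sparse() no longer exists. The replacement is to cast to a sparse dtype with astype(pd.SparseDtype(...)); the result is a regular DataFrame whose columns hold sparse data.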
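As a minimal sketch of a modern equivalent for the frame from Example 1, the numeric columns can be cast to a sparse dtype with `astype()`, while the string column is left untouched. The choice of `fill_value=0` and the name `sparse_df` are assumptions made for illustration, and the datetime index is left out for brevity.

```python
import pandas as pd

df = pd.DataFrame({'Weight': [45, 88, 56, 15, 71],
                   'Name': ['Sam', 'Andrea', 'Alex', 'Robin', 'Kia'],
                   'Age': [14, 25, 55, 8, 21]})

# Pick out the numeric columns; the 'Name' column stays dense.
numeric_cols = df.select_dtypes(include='number').columns

# Cast each numeric column to a sparse dtype that keeps its original subtype.
sparse_df = df.astype({col: pd.SparseDtype(df[col].dtype, fill_value=0)
                       for col in numeric_cols})

print(sparse_df.dtypes)
print(type(sparse_df))
```

Because none of the values in this frame equal the fill value, nothing is actually omitted here; the example only shows the mechanics of the conversion.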

In [4]:
import numpy as np
df = pd.DataFrame([10, np.nan, np.nan, np.nan], 
                  index=['apple', 'banana', 'mango', 'ananas'], 
                  columns=['Quantity'])
spdf = df.astype(pd.SparseDtype())
print(spdf)
print(type(spdf))
print(spdf.dtypes)

        Quantity
apple       10.0
banana       NaN
mango        NaN
ananas       NaN
<class 'pandas.core.frame.DataFrame'>
Quantity    Sparse[float64, nan]
dtype: object
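To check the claim about more efficient storage, the sketch below compares the memory footprint of a dense and a sparse column. The data is synthetic (10,000 rows with 1% non-missing values) and the variable names are chosen for illustration; actual savings depend on how sparse the real data is.

```python
import numpy as np
import pandas as pd

# A larger, mostly-missing column makes the difference visible.
values = np.full(10_000, np.nan)
values[::100] = 1.0  # only 1% of the entries hold data

dense = pd.DataFrame({'Quantity': values})
sparse = dense.astype(pd.SparseDtype(float, fill_value=np.nan))

dense_bytes = dense.memory_usage(index=False).sum()
sparse_bytes = sparse.memory_usage(index=False).sum()
print(dense_bytes, sparse_bytes)  # bytes used by each version of the column
print(sparse.sparse.density)      # fraction of values that are stored

# Converting back to dense recovers the original data.
restored = sparse.sparse.to_dense()
print(restored.equals(dense))
```

The `.sparse` accessor is only available when every column of the DataFrame has a sparse dtype.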


In [6]:
import pandas as pd
import numpy as np

ts = pd.Series(np.random.randn(10))
ts[2:-2] = np.nan
print(ts)
sts = ts.astype(pd.SparseDtype(int, fill_value=0))
print(sts)

0   -1.131425
1   -0.401397
2         NaN
3         NaN
4         NaN
5         NaN
6         NaN
7         NaN
8    0.480376
9   -1.476868
dtype: float64


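IntCastingNaNError: Cannot convert non-finite values (NA or inf) to integer

The cast fails because the series still contains NaN, which has no integer representation. With fill_value=0 the NaN entries are not the omitted value, so they have to be stored and converted to int, which is not possible.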
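One way to make this conversion succeed, sketched here under the assumption that floating-point storage is acceptable, is to keep a float subtype. Either NaN remains the omitted value, or the NaNs are replaced first so that zero can be used as the fill value. The names `sts_nan` and `sts_zero` are illustrative.

```python
import numpy as np
import pandas as pd

ts = pd.Series(np.random.randn(10))
ts.iloc[2:-2] = np.nan  # blank out the middle six values

# Option 1: keep NaN as the omitted value and use a float subtype.
sts_nan = ts.astype(pd.SparseDtype(float, fill_value=np.nan))
print(sts_nan.dtype)
print(sts_nan.sparse.npoints)  # number of stored values

# Option 2: replace NaN with 0 first, then omit zeros instead.
sts_zero = ts.fillna(0).astype(pd.SparseDtype(float, fill_value=0))
print(sts_zero.dtype)
print(sts_zero.sparse.npoints)
```

In both cases only the four values at the ends of the series are stored.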