# Example: Import data to visualize sets and intersections

This example shows how to import set-type data and visualize it. The data is stored in a Membership object, which records the distinct intersections (combinations of sets) that occur in the data and the intersection that each record belongs to.
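
To make the idea of an intersection concrete, here is a minimal pure-Python sketch of the bookkeeping described above. It is an illustration only and not how setvis implements Membership; the record labels and set names are hypothetical.

```python
from collections import Counter

# Hypothetical records, each mapped to the names of the sets it belongs to.
records = {
    "r1": {"Bread", "Milk"},
    "r2": {"Milk"},
    "r3": {"Bread", "Milk"},
    "r4": set(),
}

# Every distinct combination of sets found in the data is one intersection.
# Sort by size, then by name, so the numbering is deterministic.
intersections = sorted(
    {frozenset(members) for members in records.values()},
    key=lambda combo: (len(combo), sorted(combo)),
)
intersection_id = {combo: i for i, combo in enumerate(intersections)}

# Map each record to the single intersection it belongs to.
record_to_intersection = {
    record: intersection_id[frozenset(members)]
    for record, members in records.items()
}

# Number of records in each intersection (the kind of quantity a plot would show).
counts = Counter(record_to_intersection.values())

for combo in intersections:
    print(sorted(combo), counts[intersection_id[combo]])
```

Each record falls into exactly one intersection, including the empty one for records that belong to no set.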

## Import setvis and other libraries

In [None]:
import pandas as pd

from setvis import Membership
from setvis.plots import PlotSession

## Visualize the sets and intersections

The following three data import methods produce identical visualizations.

In [None]:
# This is a TAB-delimited file
input_file = "../examples/datasets/convenience_store_table.txt"

### Create a Membership object directly from the input file

In [None]:
set1 = Membership.from_csv(input_file, read_csv_args={'sep':'\t'}, set_mode=True)
# PlotSession is the core class that provides the functionality to analyze and explore the set membership patterns found in a dataset
set_session1 = PlotSession(set1)
# To visualize the dataset, call add_plot(), providing a name.
# Naming the plot is important: it allows any interactive selection made in the plot to be referred to later.
# The result is a Bokeh widget with a number of tabs, each with a different visualization of the sets and intersections.
set_session1.add_plot(name="example")

### Import the file as a data frame and then create a Membership object

In [None]:
df2 = pd.read_csv(input_file, sep='\t')
set2 = Membership.from_data_frame(df2, set_mode=True)
# Create a PlotSession object and visualize the sets and intersections
set_session2 = PlotSession(set2)
set_session2.add_plot(name="example")

### Create a PlotSession object directly from a data frame

In [None]:
df3 = pd.read_csv(input_file, sep='\t')
set_session3 = PlotSession(df3, set_mode=True)
set_session3.add_plot(name="example")

## Create the same visualizations, but from a compact data file
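
The compact file used in this section lists each record's sets as comma-separated values in a single `Product` column, rather than using one column per set. The sketch below shows, using only the standard library, how such a layout could be expanded into one True/False column per set. It assumes a hypothetical `Customer` column and made-up rows; only the `Product` column name, the TAB delimiter and the comma separator come from this example. It is not the parsing code that setvis uses.

```python
import csv
import io

# Hypothetical TAB-delimited content standing in for the compact file.
compact_text = "Customer\tProduct\n1\tBread,Milk\n2\tMilk\n3\t\n"

rows = list(csv.DictReader(io.StringIO(compact_text), delimiter="\t"))

# Split the Product column on commas to get the sets each record belongs to.
memberships = [
    {name.strip() for name in row["Product"].split(",") if name.strip()}
    for row in rows
]

# Collect every set name, then build one boolean column per set.
set_names = sorted(set().union(*memberships))
table = [
    {name: name in members for name in set_names}
    for members in memberships
]

for row, flags in zip(rows, table):
    print(row["Customer"], flags)
```

A record with an empty `Product` value ends up with False in every column, that is, in the empty intersection.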

In [None]:
# This is a TAB-delimited file, and the sets are stored in the Product column as comma-separated values
input_file = "../examples/datasets/convenience_store_membership.txt"

### Create a Membership object directly from the input file

In [None]:
memset1 = Membership.from_membership_csv(input_file, read_csv_args={'sep':'\t'}, membership_column='Product', membership_separator=',')

# PlotSession is the core class that provides the functionality to analyze and explore the set membership patterns found in a dataset
memset_session1 = PlotSession(memset1)
# To visualize the dataset, call add_plot(), providing a name.
# Naming the plot is important: it allows any interactive selection made in the plot to be referred to later.
# The result is a Bokeh widget with a number of tabs, each with a different visualization of the sets and intersections.
memset_session1.add_plot(name="example")

### Import the file as a data frame and then create a Membership object

In [None]:
df4 = pd.read_csv(input_file, sep='\t')
memset2 = Membership.from_membership_data_frame(df4, membership_column='Product', membership_separator=',')
# Create a PlotSession object and visualize the sets and intersections
memset_session2 = PlotSession(memset2)
memset_session2.add_plot(name="example")