# 5. Crosstabs and aggregations

Tally's aggregation engine lets us examine our data at every step of the analysis.

The engine supports:
- weights
- significance testing
- statistics (mean, stddev, sum, etc)
- nesting
- filters (using logic to select rows to aggregate)
- arrays (i.e. grids/loops)

In [None]:
#
# To run this notebook, you first have to install Tally. Installing Tally requires a token that gives you access.
#
from google.colab import files
import json
import io
import os
# Check if the file 'tally_keys.json' exists
if not os.path.exists('tally_keys.json'):
  uploaded = files.upload()
  # Assuming only one file is uploaded, get its filename and content
  filename = list(uploaded.keys())[0]
  file_content = uploaded[filename]
  # Load JSON directly from the uploaded content
  keys = json.loads(file_content.decode('utf-8'))
else:
  # If the file already exists, just load its content
  with open('tally_keys.json', 'r') as f:
      keys = json.load(f)

try:
  # Try to import the package
  import tally_core
except ImportError:
  # If the import fails, the package is not installed. Install it.
  !pip install git+https://{keys['tally_api']}@github.com/datasmoothie/tally-core.git@master

In [1]:
import tally_core as tc
import pandas as pd
import json
dataset = tc.DataSet('Museum')

dataset.read_gvn('./data/Example Data (A).json', './data/Example Data (A).parquet')

## Selecting the contents of a crosstab
We will use the Sport store test dataset. To explore our crosstabs we will use the variable `q1` ("what is your main fitness activity") along with `gender` and `locality`.

The parameters available for the crosstab method are:

- `x` (required)
- `y` - default []
- `w` - default None
- `f` - default None
- `ci` - default counts
- `base` - default auto
- `stats` - default None
- `sig_level` - default None
- `decimals` - default 1
- `xtotal` - default False
- `text_key` - default None


### Side/stub variables

The only required parameter is the `x` variable, often called the "side" or "stub" variable. By default, the crosstab shows the number of responses for each answer, i.e. the count.

In [2]:
dataset.crosstab(x='q1')

Unnamed: 0_level_0,Question,Total
Unnamed: 0_level_1,Values,Total
Question,Values,Unnamed: 2_level_2
q1. What is your main fitness activity?,Base,8255.0
q1. What is your main fitness activity?,Swimming,297.0
q1. What is your main fitness activity?,Running/jogging,397.0
q1. What is your main fitness activity?,Lifting weights,2298.0
q1. What is your main fitness activity?,Aerobics,2999.0
q1. What is your main fitness activity?,Yoga,194.0
q1. What is your main fitness activity?,Pilates,477.0
q1. What is your main fitness activity?,Football (soccer),894.0
q1. What is your main fitness activity?,Basketball,131.0
q1. What is your main fitness activity?,Hockey,4.0


### Top/banner variables
We provide the `y` parameter to get "top" or "banner" variables.

In [3]:
dataset.crosstab(x='q1', y=['gender', 'locality'])

Unnamed: 0_level_0,Question,gender. What is your gender?,gender. What is your gender?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?
Unnamed: 0_level_1,Values,Male,Female,CBD (central business district),Urban,Suburban,Rural,Remote
Question,Values,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2
q1. What is your main fitness activity?,Base,3952.0,4303.0,3106.0,2245.0,1180.0,718.0,829.0
q1. What is your main fitness activity?,Swimming,145.0,152.0,140.0,72.0,38.0,15.0,25.0
q1. What is your main fitness activity?,Running/jogging,205.0,192.0,156.0,104.0,55.0,29.0,45.0
q1. What is your main fitness activity?,Lifting weights,1094.0,1204.0,835.0,671.0,320.0,175.0,243.0
q1. What is your main fitness activity?,Aerobics,1438.0,1561.0,1200.0,767.0,435.0,270.0,271.0
q1. What is your main fitness activity?,Yoga,95.0,99.0,70.0,49.0,33.0,18.0,19.0
q1. What is your main fitness activity?,Pilates,199.0,278.0,154.0,127.0,82.0,47.0,52.0
q1. What is your main fitness activity?,Football (soccer),447.0,447.0,315.0,241.0,131.0,87.0,98.0
q1. What is your main fitness activity?,Basketball,65.0,66.0,44.0,35.0,24.0,15.0,12.0
q1. What is your main fitness activity?,Hockey,3.0,1.0,2.0,1.0,1.0,0.0,0.0


### Weight
We can apply a weight variable to the crosstab with `w`.

In [4]:
dataset.crosstab(x='q1', y='gender', w='weight_a')

Unnamed: 0_level_0,Question,gender. What is your gender?,gender. What is your gender?
Unnamed: 0_level_1,Values,Male,Female
Question,Values,Unnamed: 2_level_2,Unnamed: 3_level_2
q1. What is your main fitness activity?,Base,3970.0,4284.0
q1. What is your main fitness activity?,Swimming,142.3,139.8
q1. What is your main fitness activity?,Running/jogging,181.2,185.8
q1. What is your main fitness activity?,Lifting weights,1040.3,1119.6
q1. What is your main fitness activity?,Aerobics,1545.2,1647.3
q1. What is your main fitness activity?,Yoga,107.2,112.2
q1. What is your main fitness activity?,Pilates,188.8,297.6
q1. What is your main fitness activity?,Football (soccer),462.7,471.7
q1. What is your main fitness activity?,Basketball,54.3,64.4
q1. What is your main fitness activity?,Hockey,4.9,0.5
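
To make the effect of `w` concrete, here is a minimal sketch of what a weighted count means: instead of counting each respondent once, each respondent contributes their weight. The respondents and weights below are hypothetical, and the `tally` helper is an illustration written for this notebook, not part of Tally's API.

```python
from collections import defaultdict

# Hypothetical respondents as (answer, weight) pairs; not from the example dataset.
rows = [
    ("Swimming", 0.8),
    ("Swimming", 1.2),
    ("Yoga", 0.5),
    ("Aerobics", 1.5),
    ("Aerobics", 1.0),
]

def tally(rows, weighted):
    """Sum the weights per answer, or count rows when unweighted."""
    totals = defaultdict(float)
    for answer, weight in rows:
        totals[answer] += weight if weighted else 1.0
    return dict(totals)

unweighted_counts = tally(rows, weighted=False)
weighted_counts = tally(rows, weighted=True)
print(unweighted_counts)
print(weighted_counts)
```

This is why the weighted table above contains fractional cell values such as 142.3, while the unweighted tables contain whole numbers.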


### Filter (case selection)

We can select a subset of the data to run the crosstab on with the `f` parameter. The filter understands complex [Tally logic operators](tally_logic). Here, we set the filter to keep only respondents whose `gender` answer has code 2 (female).

In [5]:
dataset.crosstab(x='q1', y='gender', f={'gender':[2]})

Unnamed: 0_level_0,Question,gender. What is your gender?,gender. What is your gender?
Unnamed: 0_level_1,Values,Male,Female
Question,Values,Unnamed: 2_level_2,Unnamed: 3_level_2
q1. What is your main fitness activity?,Base,0.0,4303.0
q1. What is your main fitness activity?,Swimming,0.0,152.0
q1. What is your main fitness activity?,Running/jogging,0.0,192.0
q1. What is your main fitness activity?,Lifting weights,0.0,1204.0
q1. What is your main fitness activity?,Aerobics,0.0,1561.0
q1. What is your main fitness activity?,Yoga,0.0,99.0
q1. What is your main fitness activity?,Pilates,0.0,278.0
q1. What is your main fitness activity?,Football (soccer),0.0,447.0
q1. What is your main fitness activity?,Basketball,0.0,66.0
q1. What is your main fitness activity?,Hockey,0.0,1.0
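
The filter `{'gender': [2]}` can be read as "keep rows whose `gender` code is in the list `[2]`". A minimal sketch of that reading in plain Python follows. The rows are hypothetical, code 1 for male is an assumption (the text only states that 2 is female), and `apply_filter` is an illustrative helper rather than a Tally function.

```python
from collections import Counter

# Hypothetical respondents with coded answers.
# Assumption: gender code 1 = male, 2 = female.
rows = [
    {"gender": 1, "q1": "Yoga"},
    {"gender": 2, "q1": "Yoga"},
    {"gender": 2, "q1": "Pilates"},
    {"gender": 1, "q1": "Pilates"},
    {"gender": 2, "q1": "Pilates"},
]

def apply_filter(rows, conditions):
    """Keep rows whose value for every filtered variable is one of the allowed codes."""
    return [
        row for row in rows
        if all(row[var] in codes for var, codes in conditions.items())
    ]

females = apply_filter(rows, {"gender": [2]})
counts = Counter(row["q1"] for row in females)
print(counts)
```

Because the male rows are removed before counting, a banner that still lists "Male" shows zeros in that column, as in the table above.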


### Cell contents
The `ci` parameter selects what to show in the cells: `counts`, `c%`, and `r%` stand for counts, column percentages, and row percentages.

In [6]:
dataset.crosstab(x='q1', y='gender', ci=['counts', 'r%'])

Unnamed: 0_level_0,Question,gender. What is your gender?,gender. What is your gender?
Unnamed: 0_level_1,Values,Male,Female
Question,Values,Unnamed: 2_level_2,Unnamed: 3_level_2
q1. What is your main fitness activity?,Base,3952.0,4303.0
q1. What is your main fitness activity?,Swimming,145.0,152.0
q1. What is your main fitness activity?,,48.8,51.2
q1. What is your main fitness activity?,Running/jogging,205.0,192.0
q1. What is your main fitness activity?,,51.6,48.4
q1. What is your main fitness activity?,Lifting weights,1094.0,1204.0
q1. What is your main fitness activity?,,47.6,52.4
q1. What is your main fitness activity?,Aerobics,1438.0,1561.0
q1. What is your main fitness activity?,,47.9,52.1
q1. What is your main fitness activity?,Yoga,95.0,99.0


### Base
When a crosstab is weighted, we can choose which bases to show with the `base` parameter. The options are `weighted`, `unweighted`, `both`, and `auto`. With `auto`, only one base is shown.

In [7]:
dataset.crosstab('q1', w='weight_a', base='both')

Unnamed: 0_level_0,Question,Total
Unnamed: 0_level_1,Values,Total
Question,Values,Unnamed: 2_level_2
q1. What is your main fitness activity?,Unweighted base,8255.0
q1. What is your main fitness activity?,Base,8255.0
q1. What is your main fitness activity?,Swimming,282.0
q1. What is your main fitness activity?,Running/jogging,367.0
q1. What is your main fitness activity?,Lifting weights,2159.9
q1. What is your main fitness activity?,Aerobics,3192.5
q1. What is your main fitness activity?,Yoga,219.4
q1. What is your main fitness activity?,Pilates,486.5
q1. What is your main fitness activity?,Football (soccer),934.4
q1. What is your main fitness activity?,Basketball,118.6
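
As a rough illustration of the two bases: the unweighted base is the number of respondents, and the weighted base is the sum of their weights. The weights below are hypothetical and chosen to average 1. When weights average 1, the two bases coincide, which may be why both read 8255.0 in the table above (an assumption about the example weight, not something the output confirms).

```python
# Hypothetical weights for four respondents; they average exactly 1.0.
weights = [0.5, 1.5, 1.25, 0.75]

unweighted_base = len(weights)   # number of respondents
weighted_base = sum(weights)     # sum of their weights

print(unweighted_base, weighted_base)
```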


### Stats

The `stats` parameter takes a list of summary statistics to calculate for each column.

In [20]:
dataset.crosstab('q1', 'gender', stats=['mean', 'sem', 'stddev', 'varcoeff', 'min', 'max', 'median', 'lower_q', 'upper_q'])

Unnamed: 0_level_0,Question,gender. What is your gender?,gender. What is your gender?
Unnamed: 0_level_1,Values,Male,Female
Question,Values,Unnamed: 2_level_2,Unnamed: 3_level_2
q1. What is your main fitness activity?,Base,3952.0,4303.0
q1. What is your main fitness activity?,Swimming,145.0,152.0
q1. What is your main fitness activity?,Running/jogging,205.0,192.0
q1. What is your main fitness activity?,Lifting weights,1094.0,1204.0
q1. What is your main fitness activity?,Aerobics,1438.0,1561.0
q1. What is your main fitness activity?,Yoga,95.0,99.0
q1. What is your main fitness activity?,Pilates,199.0,278.0
q1. What is your main fitness activity?,Football (soccer),447.0,447.0
q1. What is your main fitness activity?,Basketball,65.0,66.0
q1. What is your main fitness activity?,Hockey,3.0,1.0
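
Statistics such as the mean are computed on the answer codes, so they are most meaningful for ordered scales. As a worked example, the sketch below reproduces the `Mean` row of the `q2b` table in the significance testing section from its counts. It assumes the codes are Regularly = 1, Irregularly = 2, Never = 3, which the notebook does not state. `coded_mean` is an illustrative helper, not a Tally function.

```python
# Counts from the q2b table in the significance testing section.
# Assumed codes: Regularly = 1, Irregularly = 2, Never = 3.
counts = {
    "Male": {1: 385, 2: 2458, 3: 155},
    "Female": {1: 356, 2: 2689, 3: 264},
}

def coded_mean(code_counts):
    """Mean of the answer codes, where each code is repeated by its count."""
    base = sum(code_counts.values())
    return sum(code * n for code, n in code_counts.items()) / base

means = {group: round(coded_mean(c), 1) for group, c in counts.items()}
print(means)  # {'Male': 1.9, 'Female': 2.0}
```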


### Significance testing

Significance testing is enabled by passing an alpha value to the `sig_level` parameter, which runs column proportion tests between the banner columns. For alpha = 0.05, we run:

In [24]:
dataset.crosstab('q2b', 'gender', sig_level=0.05, stats=['mean'])

Unnamed: 0_level_0,Question,gender. What is your gender?,gender. What is your gender?
Unnamed: 0_level_1,Values,Male,Female
Unnamed: 0_level_2,Test-IDs,A,B
Question,Values,Unnamed: 2_level_3,Unnamed: 3_level_3
q2b. How regularly do you participate in any fitness or sports activity?,Base,2998.0,3309.0
q2b. How regularly do you participate in any fitness or sports activity?,Regularly,385.0,356.0
q2b. How regularly do you participate in any fitness or sports activity?,0.05,B,
q2b. How regularly do you participate in any fitness or sports activity?,Irregularly,2458.0,2689.0
q2b. How regularly do you participate in any fitness or sports activity?,0.05,,
q2b. How regularly do you participate in any fitness or sports activity?,Never,155.0,264.0
q2b. How regularly do you participate in any fitness or sports activity?,0.05,,A
q2b. How regularly do you participate in any fitness or sports activity?,Mean,1.9,2.0
q2b. How regularly do you participate in any fitness or sports activity?,0.05,,A


The header gets a new row, `Test-IDs`, and each column gets a letter as an ID. A letter in a cell means that the column's value is significantly higher than the value in the column with that ID. In the table above, men are significantly more likely than women to work out regularly, and women are significantly more likely than men to never work out. Note that this is a test dataset, not real data!
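
To show the kind of calculation behind these flags, here is a sketch of a pooled two-proportion z-test applied to the counts in the table. This is a textbook test chosen for illustration. The notebook does not say which exact test Tally runs, so treat the function and its details as assumptions.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(hits_a, base_a, hits_b, base_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a, p_b = hits_a / base_a, hits_b / base_b
    pooled = (hits_a + hits_b) / (base_a + base_b)
    se = sqrt(pooled * (1 - pooled) * (1 / base_a + 1 / base_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Bases and counts copied from the table above: (male, female).
bases = (2998, 3309)
rows = {"Regularly": (385, 356), "Irregularly": (2458, 2689), "Never": (155, 264)}
alpha = 0.05

results = {}
for answer, (male, female) in rows.items():
    z, p = two_proportion_z(male, bases[0], female, bases[1])
    results[answer] = (z, p, p < alpha)
    print(f"{answer}: z = {z:.2f}, significant = {p < alpha}")
```

Under these assumptions, "Regularly" comes out significantly higher for men, "Never" significantly higher for women, and "Irregularly" shows no significant difference, which agrees with the letters in the table.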

### Decimals

Crosstabs can have as many decimals as we like, and this is controlled with the `decimals` parameter.

In [28]:
dataset.crosstab('q2b', 'gender', ci=['c%'], decimals=4)

Unnamed: 0_level_0,Question,gender. What is your gender?,gender. What is your gender?
Unnamed: 0_level_1,Values,Male,Female
Question,Values,Unnamed: 2_level_2,Unnamed: 3_level_2
q2b. How regularly do you participate in any fitness or sports activity?,Base,2998.0,3309.0
q2b. How regularly do you participate in any fitness or sports activity?,Regularly,12.8419,10.7585
q2b. How regularly do you participate in any fitness or sports activity?,Irregularly,81.988,81.2632
q2b. How regularly do you participate in any fitness or sports activity?,Never,5.1701,7.9782
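
The column percentages above can be checked against the counts from the significance testing table: each count is divided by its column base and rounded to the requested number of decimals. The helper below is an illustrative sketch, not Tally's implementation.

```python
# Bases and counts copied from the q2b tables in this notebook.
bases = {"Male": 2998, "Female": 3309}
counts = {
    "Regularly": {"Male": 385, "Female": 356},
    "Irregularly": {"Male": 2458, "Female": 2689},
    "Never": {"Male": 155, "Female": 264},
}

def column_percentages(counts, bases, decimals):
    """Express each cell as a percentage of its column base."""
    return {
        answer: {
            col: round(100 * n / bases[col], decimals) for col, n in cells.items()
        }
        for answer, cells in counts.items()
    }

col_pct = column_percentages(counts, bases, decimals=4)
for answer, cells in col_pct.items():
    print(answer, cells)
```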


### Totals

Setting `xtotal=True` adds a column at the far left that shows the total across all banner categories.

In [29]:
dataset.crosstab('q2b', 'gender', xtotal=True)

Unnamed: 0_level_0,Question,Total,gender. What is your gender?,gender. What is your gender?
Unnamed: 0_level_1,Values,Total,Male,Female
Question,Values,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2
q2b. How regularly do you participate in any fitness or sports activity?,Base,6307.0,2998.0,3309.0
q2b. How regularly do you participate in any fitness or sports activity?,Regularly,741.0,385.0,356.0
q2b. How regularly do you participate in any fitness or sports activity?,Irregularly,5147.0,2458.0,2689.0
q2b. How regularly do you participate in any fitness or sports activity?,Never,419.0,155.0,264.0


### Languages

Tally supports multiple languages, and crosstabs can apply labels from any language present in the original dataset. The `text_key` parameter selects the language.

In [7]:
museum_dataset = tc.DataSet('museums')
museum_dataset.read_gvn('./data/Example_Museum.json', './data/Example_Museum.parquet')
museum_dataset.crosstab('museums', text_key='es-ES')
museum_dataset.crosstab('museums', text_key='en-GB')

Unnamed: 0_level_0,Question,Total
Unnamed: 0_level_1,Values,Total
Question,Values,Unnamed: 2_level_2
museums. ¿Qué museos o galerías de arte ha visitado o planea visitar?,Base,426.0
museums. ¿Qué museos o galerías de arte ha visitado o planea visitar?,Museo Nacional de Ciencias,333.0
museums. ¿Qué museos o galerías de arte ha visitado o planea visitar?,Museo del Diseño,92.0
museums. ¿Qué museos o galerías de arte ha visitado o planea visitar?,Instituto de Textiles y Modas,47.0
museums. ¿Qué museos o galerías de arte ha visitado o planea visitar?,Museo Arqueológico,26.0
museums. ¿Qué museos o galerías de arte ha visitado o planea visitar?,Galeria Nacional de Arte,19.0
museums. ¿Qué museos o galerías de arte ha visitado o planea visitar?,Galeria del Norte,21.0
museums. ¿Qué museos o galerías de arte ha visitado o planea visitar?,Otra,37.0
museums. ¿Qué museos o galerías de arte ha visitado o planea visitar?,No respondió,0.0


Unnamed: 0_level_0,Question,Total
Unnamed: 0_level_1,Values,Total
Question,Values,Unnamed: 2_level_2
museums. Which museums or art galleries have you visited or do you plan to visit?,Base,426.0
museums. Which museums or art galleries have you visited or do you plan to visit?,National Museum of Science,333.0
museums. Which museums or art galleries have you visited or do you plan to visit?,Museum of Design,92.0
museums. Which museums or art galleries have you visited or do you plan to visit?,Institute of Textiles and Fashion,47.0
museums. Which museums or art galleries have you visited or do you plan to visit?,Archeological Museum,26.0
museums. Which museums or art galleries have you visited or do you plan to visit?,National Art Gallery,19.0
museums. Which museums or art galleries have you visited or do you plan to visit?,Northern Gallery,21.0
museums. Which museums or art galleries have you visited or do you plan to visit?,Other,37.0
museums. Which museums or art galleries have you visited or do you plan to visit?,Not answered,0.0


## Nesting crosstabs

Crosstabs are nested by using the `>` operator in the `y` parameter (i.e. the banner/top), for example `'gender > locality'`. The nesting mechanism supports arbitrarily deep nesting.

In [None]:
dataset.crosstab('q1', 'gender > locality')

Unnamed: 0_level_0,Question,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?
Unnamed: 0_level_1,Values,Male,Male,Male,Male,Male,Female,Female,Female,Female,Female
Unnamed: 0_level_2,Question,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?
Unnamed: 0_level_3,Values,CBD (central business district),Urban,Suburban,Rural,Remote,CBD (central business district),Urban,Suburban,Rural,Remote
Question,Values,Unnamed: 2_level_4,Unnamed: 3_level_4,Unnamed: 4_level_4,Unnamed: 5_level_4,Unnamed: 6_level_4,Unnamed: 7_level_4,Unnamed: 8_level_4,Unnamed: 9_level_4,Unnamed: 10_level_4,Unnamed: 11_level_4
q1. What is your main fitness activity?,Base,1567.0,1061.0,559.0,332.0,352.0,1539.0,1184.0,621.0,386.0,477.0
q1. What is your main fitness activity?,Swimming,69.0,35.0,13.0,7.0,17.0,71.0,37.0,25.0,8.0,8.0
q1. What is your main fitness activity?,Running/jogging,88.0,50.0,29.0,16.0,19.0,68.0,54.0,26.0,13.0,26.0
q1. What is your main fitness activity?,Lifting weights,429.0,300.0,157.0,80.0,105.0,406.0,371.0,163.0,95.0,138.0
q1. What is your main fitness activity?,Aerobics,615.0,371.0,201.0,125.0,100.0,585.0,396.0,234.0,145.0,171.0
q1. What is your main fitness activity?,Yoga,32.0,22.0,22.0,7.0,10.0,38.0,27.0,11.0,11.0,9.0
q1. What is your main fitness activity?,Pilates,69.0,51.0,34.0,23.0,14.0,85.0,76.0,48.0,24.0,38.0
q1. What is your main fitness activity?,Football (soccer),163.0,125.0,62.0,34.0,52.0,152.0,116.0,69.0,53.0,46.0
q1. What is your main fitness activity?,Basketball,18.0,20.0,11.0,9.0,6.0,26.0,15.0,13.0,6.0,6.0
q1. What is your main fitness activity?,Hockey,1.0,1.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0


In [17]:
dataset.crosstab('q1', 'gender > q2b > q4 > locality')

Unnamed: 0_level_0,Question,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?,gender. What is your gender?
Unnamed: 0_level_1,Values,Male,Male,Male,Male,Male,Male,Male,Male,Male,Male,...,Female,Female,Female,Female,Female,Female,Female,Female,Female,Female
Unnamed: 0_level_2,Question,q2b. How regularly do you participate in any fitness or sports activity?,q2b. How regularly do you participate in any fitness or sports activity?,q2b. How regularly do you participate in any fitness or sports activity?,q2b. How regularly do you participate in any fitness or sports activity?,q2b. How regularly do you participate in any fitness or sports activity?,q2b. How regularly do you participate in any fitness or sports activity?,q2b. How regularly do you participate in any fitness or sports activity?,q2b. How regularly do you participate in any fitness or sports activity?,q2b. How regularly do you participate in any fitness or sports activity?,q2b. How regularly do you participate in any fitness or sports activity?,...,q2b. How regularly do you participate in any fitness or sports activity?,q2b. How regularly do you participate in any fitness or sports activity?,q2b. How regularly do you participate in any fitness or sports activity?,q2b. How regularly do you participate in any fitness or sports activity?,q2b. How regularly do you participate in any fitness or sports activity?,q2b. How regularly do you participate in any fitness or sports activity?,q2b. How regularly do you participate in any fitness or sports activity?,q2b. How regularly do you participate in any fitness or sports activity?,q2b. How regularly do you participate in any fitness or sports activity?,q2b. How regularly do you participate in any fitness or sports activity?
Unnamed: 0_level_3,Values,Regularly,Regularly,Regularly,Regularly,Regularly,Regularly,Regularly,Regularly,Regularly,Regularly,...,Never,Never,Never,Never,Never,Never,Never,Never,Never,Never
Unnamed: 0_level_4,Question,q4. Do you ever participate in sports activities with people in your household?,q4. Do you ever participate in sports activities with people in your household?,q4. Do you ever participate in sports activities with people in your household?,q4. Do you ever participate in sports activities with people in your household?,q4. Do you ever participate in sports activities with people in your household?,q4. Do you ever participate in sports activities with people in your household?,q4. Do you ever participate in sports activities with people in your household?,q4. Do you ever participate in sports activities with people in your household?,q4. Do you ever participate in sports activities with people in your household?,q4. Do you ever participate in sports activities with people in your household?,...,q4. Do you ever participate in sports activities with people in your household?,q4. Do you ever participate in sports activities with people in your household?,q4. Do you ever participate in sports activities with people in your household?,q4. Do you ever participate in sports activities with people in your household?,q4. Do you ever participate in sports activities with people in your household?,q4. Do you ever participate in sports activities with people in your household?,q4. Do you ever participate in sports activities with people in your household?,q4. Do you ever participate in sports activities with people in your household?,q4. Do you ever participate in sports activities with people in your household?,q4. Do you ever participate in sports activities with people in your household?
Unnamed: 0_level_5,Values,Yes,Yes,Yes,Yes,Yes,No,No,No,No,No,...,Yes,Yes,Yes,Yes,Yes,No,No,No,No,No
Unnamed: 0_level_6,Question,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,...,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?,locality. How would you describe the areas in which you live?
Unnamed: 0_level_7,Values,CBD (central business district),Urban,Suburban,Rural,Remote,CBD (central business district),Urban,Suburban,Rural,Remote,...,CBD (central business district),Urban,Suburban,Rural,Remote,CBD (central business district),Urban,Suburban,Rural,Remote
Question,Values,Unnamed: 2_level_8,Unnamed: 3_level_8,Unnamed: 4_level_8,Unnamed: 5_level_8,Unnamed: 6_level_8,Unnamed: 7_level_8,Unnamed: 8_level_8,Unnamed: 9_level_8,Unnamed: 10_level_8,Unnamed: 11_level_8,Unnamed: 12_level_8,Unnamed: 13_level_8,Unnamed: 14_level_8,Unnamed: 15_level_8,Unnamed: 16_level_8,Unnamed: 17_level_8,Unnamed: 18_level_8,Unnamed: 19_level_8,Unnamed: 20_level_8,Unnamed: 21_level_8,Unnamed: 22_level_8
q1. What is your main fitness activity?,Base,61.0,54.0,20.0,10.0,9.0,86.0,59.0,31.0,26.0,21.0,...,20.0,11.0,14.0,4.0,12.0,72.0,51.0,22.0,21.0,35.0
q1. What is your main fitness activity?,Swimming,1.0,2.0,0.0,0.0,1.0,4.0,2.0,0.0,0.0,0.0,...,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,2.0,1.0
q1. What is your main fitness activity?,Running/jogging,6.0,2.0,0.0,0.0,1.0,4.0,0.0,1.0,1.0,0.0,...,0.0,0.0,1.0,1.0,2.0,3.0,3.0,1.0,1.0,4.0
q1. What is your main fitness activity?,Lifting weights,16.0,16.0,5.0,2.0,3.0,30.0,16.0,13.0,4.0,10.0,...,3.0,4.0,4.0,1.0,4.0,23.0,13.0,9.0,9.0,10.0
q1. What is your main fitness activity?,Aerobics,23.0,22.0,8.0,3.0,2.0,34.0,25.0,10.0,16.0,9.0,...,8.0,4.0,8.0,2.0,6.0,34.0,27.0,8.0,5.0,7.0
q1. What is your main fitness activity?,Yoga,2.0,1.0,3.0,1.0,0.0,0.0,2.0,0.0,0.0,0.0,...,2.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0
q1. What is your main fitness activity?,Pilates,4.0,2.0,2.0,0.0,1.0,3.0,5.0,2.0,1.0,0.0,...,1.0,2.0,0.0,0.0,0.0,3.0,3.0,1.0,0.0,4.0
q1. What is your main fitness activity?,Football (soccer),4.0,7.0,0.0,1.0,1.0,10.0,4.0,3.0,3.0,2.0,...,3.0,1.0,0.0,0.0,0.0,3.0,1.0,0.0,2.0,2.0
q1. What is your main fitness activity?,Basketball,2.0,1.0,1.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
q1. What is your main fitness activity?,Hockey,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
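
One way to picture a nested banner is as one column per combination of categories, with the leftmost variable varying slowest. The sketch below expands a `>` specification that way, using the category labels visible in the tables above. `nested_columns` is a hypothetical helper, and the assumption that every combination gets a column (even an empty one) is inferred from the output rather than documented.

```python
from itertools import product

# Category labels as they appear in the tables above.
categories = {
    "gender": ["Male", "Female"],
    "q2b": ["Regularly", "Irregularly", "Never"],
    "q4": ["Yes", "No"],
    "locality": ["CBD (central business district)", "Urban", "Suburban", "Rural", "Remote"],
}

def nested_columns(spec):
    """Expand an 'a > b > c' spec into one column per combination of categories."""
    variables = [name.strip() for name in spec.split(">")]
    return list(product(*(categories[v] for v in variables)))

two_level = nested_columns("gender > locality")
four_level = nested_columns("gender > q2b > q4 > locality")
print(len(two_level), len(four_level))  # 10 60
```

The column count grows multiplicatively with each nesting level, so cell bases shrink quickly, as the small counts in the four-level table show.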


## Working with arrays (grids/loops)

Crosstabs also work with `arrays` (grids/loops). We use the museum dataset again.

In [12]:
museum_dataset.crosstab('rating.Column')

Unnamed: 0_level_0,Question,Total,rating.Column. Q30,rating.Column. Q30,rating.Column. Q30,rating.Column. Q30,rating.Column. Q30
Unnamed: 0_level_1,Values,Base,Not at all interested (1),Not particularly interested (2),No opinion (3),Slightly interested (4),Very interested (5)
Array,Questions,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2
rating.Column. Q30,Other,0.0,0.0,0.0,0.0,0.0,0.0
rating.Column. Q30,Dinosaurs,258.0,10.0,1.0,27.0,113.0,107.0
rating.Column. Q30,Conservation,56.0,3.0,2.0,7.0,30.0,14.0
rating.Column. Q30,Fish and reptiles,111.0,6.0,5.0,28.0,45.0,27.0
rating.Column. Q30,Fossils,65.0,6.0,1.0,15.0,25.0,18.0
rating.Column. Q30,Birds,94.0,9.0,8.0,18.0,32.0,27.0
rating.Column. Q30,Insects,79.0,13.0,2.0,18.0,27.0,19.0
rating.Column. Q30,Whales,82.0,2.0,0.0,14.0,28.0,38.0
rating.Column. Q30,Mammals,138.0,4.0,3.0,26.0,55.0,50.0
rating.Column. Q30,Minerals,86.0,13.0,6.0,14.0,25.0,28.0
