# Explore data

In [1]:
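import os
import sys
# Add the directory two levels up to the import path so that `tally` can be imported
sys.path.append(os.path.abspath('../../'))
import tally
# Read the API key from the `tally_api_key` environment variable
dataset = tally.DataSet(api_key=os.environ.get('tally_api_key'))
# Load the example SPSS file
dataset.use_spss('./data/Example Data (A).sav')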

 
Once the data is loaded, we can explore which variables it contains and the metadata for each of them. We assume that the data [has been loaded](../1_load_data) into a variable called `dataset`. For more information on the API endpoints used in these examples, refer to the [Tally API documentation](https://tally.datasmoothie.com).

## `variables` - listing available variables by type 

Use the [variables](https://tally.datasmoothie.com/#tag/Data-Processing/operation/variables) method to get a list of variables. It returns a dictionary with the keys `single`, `delimited set`, `array`, `int`, `float`, `string`, and `date`. Each key maps to a list of strings, which are the names of the variables of that type.

In [2]:
dataset.variables()

{'single': ['gender',
  'locality',
  'ethnicity',
  'religion',
  'q1',
  'q2__1',
  'q2__2',
  'q2__3',
  'q2__4',
  'q2__5',
  'q2__6',
  'q2__97',
  'q2__98',
  'q2b',
  'q3__1',
  'q3__2',
  'q3__3',
  'q3__4',
  'q3__5',
  'q3__6',
  'q3__7',
  'q3__8',
  'q3__97',
  'q4',
  'q5_1',
  'q5_2',
  'q5_3',
  'q5_4',
  'q5_5',
  'q5_6',
  'q6_1',
  'q6_2',
  'q6_3',
  'q7_1',
  'q7_2',
  'q7_3',
  'q7_4',
  'q7_5',
  'q7_6',
  'q8__1',
  'q8__2',
  'q8__3',
  'q8__4',
  'q8__5',
  'q8__96',
  'q8__98',
  'q9__1',
  'q9__2',
  'q9__3',
  'q9__4',
  'q9__96',
  'q9__98',
  'q9__99',
  'Wave',
  'q14r01c01',
  'q14r01c02',
  'q14r01c03',
  'q14r02c01',
  'q14r02c02',
  'q14r02c03',
  'q14r03c01',
  'q14r03c02',
  'q14r03c03',
  'q14r04c01',
  'q14r04c02',
  'q14r04c03',
  'q14r05c01',
  'q14r05c02',
  'q14r05c03',
  'q14r06c01',
  'q14r06c02',
  'q14r06c03',
  'q14r07c01',
  'q14r07c02',
  'q14r07c03',
  'q14r08c01',
  'q14r08c02',
  'q14r08c03',
  'q14r09c01',
  'q14r09c02',
  'q14r09c0
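
Because the return value is a plain dictionary keyed by type, it is easy to summarise. The following is a minimal sketch, assuming the shape described above. `sample_variables` is a hypothetical, shortened stand-in for the result of `dataset.variables()`, and `count_by_type` and `type_of` are illustrative helpers, not part of the tally API.

```python
# Hypothetical, shortened stand-in for the dictionary returned by
# dataset.variables(); the real result lists every variable in the dataset.
sample_variables = {
    'single': ['gender', 'locality', 'q1'],
    'delimited set': [],
    'array': [],
    'int': [],
    'float': [],
    'string': [],
    'date': [],
}


def count_by_type(variables):
    """Return {type_name: number_of_variables} for a variables()-style dict."""
    return {var_type: len(names) for var_type, names in variables.items()}


def type_of(variables, name):
    """Return the type key whose list contains `name`, or None if it is absent."""
    for var_type, names in variables.items():
        if name in names:
            return var_type
    return None


counts = count_by_type(sample_variables)
locality_type = type_of(sample_variables, 'locality')
```

With a loaded dataset, the same helpers could be applied to the real result, for example `count_by_type(dataset.variables())`.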

## `meta` - explore answer labels and codes 

Use the [meta](https://tally.datasmoothie.com/#tag/Data-Processing/operation/meta) method to explore answer codes and labels.

In [3]:
dataset.meta(variable='q1')

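| | codes | texts | missing |
|---|---|---|---|
| 1 | 1 | Swimming | |
| 2 | 2 | Running/jogging | |
| 3 | 3 | Lifting weights | |
| 4 | 4 | Aerobics | |
| 5 | 5 | Yoga | |
| 6 | 6 | Pilates | |
| 7 | 7 | Football (soccer) | |
| 8 | 8 | Basketball | |
| 9 | 9 | Hockey | |
| 10 | 96 | Other | |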


## Other methods to explore data

Other methods for exploring the data include **get_variable_text**, **find**, and **values**. See the [DataSet class](api_dataset) for the full list.

In [4]:
dataset.get_variable_text(name='locality')

'How would you describe the areas in which you live?'

In [5]:
dataset.find(str_tags=['q2'])

{'variables': ['q2__1',
  'q2__2',
  'q2__3',
  'q2__4',
  'q2__5',
  'q2__6',
  'q2__97',
  'q2__98',
  'q2b'],
 'params': {'str_tags': ['q2']}}
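
In the output above, `q2b` is returned alongside the `q2__*` columns. The following is a minimal sketch for separating them, assuming that the double underscore marks columns belonging to one multi-response question. That is an inference from the variable names, not something confirmed by the tally documentation. `group_by_stem` is a hypothetical helper, and `found` copies the output shown above.

```python
from collections import defaultdict

# Copy of the result shown above for dataset.find(str_tags=['q2']).
found = {
    'variables': ['q2__1', 'q2__2', 'q2__3', 'q2__4', 'q2__5',
                  'q2__6', 'q2__97', 'q2__98', 'q2b'],
    'params': {'str_tags': ['q2']},
}


def group_by_stem(names, separator='__'):
    """Group variable names by the part before the first `separator`."""
    groups = defaultdict(list)
    for name in names:
        stem = name.split(separator, 1)[0]
        groups[stem].append(name)
    return dict(groups)


groups = group_by_stem(found['variables'])
# groups['q2'] holds the eight q2__* columns; groups['q2b'] holds only 'q2b'.
```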

In [6]:
dataset.values(name='locality')

{'values': {'1': 'CBD (central business district)',
  '2': 'Urban',
  '3': 'Suburban',
  '4': 'Rural',
  '5': 'Remote'},
 'params': {'name': 'locality'}}
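
The keys in the `values` dictionary are strings, whereas the dataframe shown later on this page stores answer codes as floats such as `1.0`. The following is a minimal sketch of a lookup that bridges the two. `decode` is a hypothetical helper, not a tally method, and `values_result` copies the output shown above.

```python
# Copy of the result shown above for dataset.values(name='locality').
values_result = {
    'values': {'1': 'CBD (central business district)',
               '2': 'Urban',
               '3': 'Suburban',
               '4': 'Rural',
               '5': 'Remote'},
    'params': {'name': 'locality'},
}


def decode(code, values_result, default=None):
    """Return the label for a code given as 1, 1.0 or '1'.

    Missing values (None, NaN), non-numeric input, non-integer numbers
    and unknown codes all return `default`.
    """
    try:
        number = float(code)
        whole = int(number)  # raises for NaN and infinity
    except (TypeError, ValueError, OverflowError):
        return default
    if number != whole:
        # A code such as 2.5 does not identify an answer option.
        return default
    return values_result['values'].get(str(whole), default)


label = decode(3.0, values_result)
```

Returning `default` rather than raising keeps the helper usable on columns that contain missing answers.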

## Pandas dataframe

The data can also be accessed as a pandas dataframe.

In [7]:
df = dataset.get_dataframe()
df.head()

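| | Unnamed: 0 | record_number | unique_id | age | birth_day | birth_month | birth_year | gender | locality | ethnicity | ... | q14r08c01 | q14r08c02 | q14r08c03 | q14r09c01 | q14r09c02 | q14r09c03 | q14r10c01 | q14r10c02 | q14r10c03 | @1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 2.0 | 402891.0 | 22.0 | 25.0 | 3.0 | 1993.0 | 1.0 | 1.0 | 1.0 | ... | 3.0 | 4.0 | | 1.0 | 4.0 | | 3.0 | 1.0 | | 1.0 |
| 1 | 1 | 3.0 | 27541022.0 | 22.0 | 27.0 | 11.0 | 1993.0 | 2.0 | 3.0 | 1.0 | ... | 4.0 | | | 4.0 | | | 2.0 | | | 1.0 |
| 2 | 2 | 4.0 | 335506.0 | 28.0 | 3.0 | 11.0 | 1987.0 | 1.0 | 2.0 | 1.0 | ... | 2.0 | 1.0 | | 3.0 | 2.0 | | 2.0 | 1.0 | | 1.0 |
| 3 | 3 | 5.0 | 22885610.0 | 31.0 | 11.0 | 9.0 | 1984.0 | 1.0 | 2.0 | 1.0 | ... | 3.0 | 3.0 | | 4.0 | 1.0 | | 4.0 | 3.0 | | 1.0 |
| 4 | 4 | 6.0 | 229122.0 | 38.0 | 24.0 | 4.0 | 1977.0 | 1.0 | 1.0 | 1.0 | ... | 1.0 | | | 4.0 | | | 1.0 | | | 1.0 |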
