# Dataset

**Dataset** is an abstraction of the local file system.
Users can add their local paths into this system to easily access the data inside.
The basic concept is to treat a data file as a property of a ``Dataset`` object.
When users access one of these properties, ``Dataset`` loads the corresponding data file automatically.
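
The file-as-property idea can be illustrated with a minimal sketch. This is not XenonPy's implementation: ``LazyFiles`` is a hypothetical stand-in class, it handles only ``.csv`` files, and it reads them with the standard ``csv`` module instead of pandas. It only shows how ``__getattr__`` can defer reading a file until the attribute is accessed.

```python
import csv
import tempfile
from pathlib import Path


class LazyFiles:
    """Hypothetical stand-in for Dataset: files in the given dirs become attributes."""

    def __init__(self, *dirs, suffix=".csv"):
        # Map each file stem (e.g. "data1") to its full path; nothing is read yet.
        self._files = {
            p.name[: -len(suffix)]: p
            for d in dirs
            for p in sorted(Path(d).iterdir())
            if p.name.endswith(suffix)
        }

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails,
        # so the file is read at the moment the attribute is accessed.
        files = self.__dict__.get("_files", {})
        if name not in files:
            raise AttributeError(name)
        with open(files[name], newline="") as f:
            return list(csv.reader(f))


# Demo in a temporary directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "data1.csv").write_text("0,1\n1,2\n3,4\n")
    files = LazyFiles(tmp)
    rows = files.data1  # the CSV is read here, on attribute access
```

Because ``__getattr__`` runs only for names that are not ordinary attributes, regular methods and fields of the object keep working as usual.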

The following tutorial shows how easy it is to interact with data in this system.

In [1]:
from xenonpy.datatools import Dataset

# pass directory paths as arguments at initialization
ds = Dataset('set1', 'set2')
ds

<Dataset> includes:
"data1": /Users/liuchang/projects/xenonpy/samples/set1/data1.pd.xz
"data2": /Users/liuchang/projects/xenonpy/samples/set2/data2.pd.xz

In [2]:
# load data

ds.data1

Unnamed: 0,0,1
0,1,2
1,3,4


In [3]:
# change backend

ds_csv = ds.csv
ds_csv

<Dataset> includes:
"data1": /Users/liuchang/projects/xenonpy/samples/set1/data1.csv
"data2": /Users/liuchang/projects/xenonpy/samples/set2/data2.csv

In [4]:
ds_csv.data2

Unnamed: 0.1,Unnamed: 0,0,1
0,0,1,2
1,1,3,4


In [5]:
# set backend at init

ds = Dataset('set1', 'set2', backend='pickle')
ds

<Dataset> includes:
"data1": /Users/liuchang/projects/xenonpy/samples/set1/data1.pkl.z
"data2": /Users/liuchang/projects/xenonpy/samples/set2/data2.pkl.z

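The outputs above suggest that each backend corresponds to a file extension (``.pd.xz``, ``.csv``, ``.pkl.z``). The sketch below shows one way such a lookup could work. It is an assumption for illustration, not the library's code: ``BACKEND_EXT``, ``find_files``, the key ``"default"``, and the relative paths are all hypothetical, and it assumes every sample directory holds one file per format.

```python
from pathlib import Path

# Extensions copied from the outputs shown in this tutorial.
# The key "default" is an assumed name for the backend used in In [1].
BACKEND_EXT = {
    "default": ".pd.xz",
    "csv": ".csv",
    "pickle": ".pkl.z",
}


def find_files(paths, backend):
    """Return {name: path} for the files whose extension matches the backend."""
    ext = BACKEND_EXT[backend]
    return {Path(p).name[: -len(ext)]: p for p in paths if p.endswith(ext)}


# Hypothetical directory listing with one file per format.
paths = [
    "set1/data1.pd.xz", "set1/data1.csv", "set1/data1.pkl.z",
    "set2/data2.pd.xz", "set2/data2.csv", "set2/data2.pkl.z",
]
by_csv = find_files(paths, "csv")
by_pickle = find_files(paths, "pickle")
```

Switching the backend changes only which files are visible under each name; the names themselves (``data1``, ``data2``) stay the same.
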
# Preset

Currently, two sets of element-level property data are available: ``elements`` and ``elements_completed``, an imputed version of ``elements`` in which missing values have been filled in.

In [6]:
from xenonpy.datatools import preset

preset.elements.head(5)

Unnamed: 0,atomic_number,atomic_radius,atomic_radius_rahm,atomic_volume,atomic_weight,boiling_point,brinell_hardness,bulk_modulus,c6,c6_gb,...,vdw_radius_bondi,vdw_radius_dreiding,vdw_radius_mm3,vdw_radius_rt,vdw_radius_truhlar,vdw_radius_uff,sound_velocity,vickers_hardness,Polarizability,youngs_modulus
H,1,79.0,154.0,14.1,1.008,20.28,,,6.499027,6.51,...,120.0,319.5,162.0,110.0,,288.6,1270.0,,0.666793,
He,2,,134.0,31.8,4.002602,4.216,,,1.42,1.47,...,140.0,,153.0,,,236.2,970.0,,0.205052,
Li,3,155.0,220.0,13.1,6.94,1118.15,,11.0,1392.0,1410.0,...,181.0,,255.0,,,245.1,6000.0,,24.33,4.9
Be,4,112.0,219.0,5.0,9.012183,3243.0,600.0,130.0,227.0,214.0,...,,,223.0,,153.0,274.5,13000.0,1670.0,5.6,287.0
B,5,98.0,205.0,4.6,10.81,3931.0,,320.0,99.5,99.2,...,,402.0,215.0,,192.0,408.3,16200.0,49000.0,3.03,


In [7]:
preset.elements_completed.head(5)

Unnamed: 0,atomic_number,atomic_radius,atomic_radius_rahm,atomic_volume,atomic_weight,boiling_point,bulk_modulus,c6_gb,covalent_radius_cordero,covalent_radius_pyykko,...,num_s_valence,period,specific_heat,thermal_conductivity,vdw_radius,vdw_radius_alvarez,vdw_radius_mm3,vdw_radius_uff,sound_velocity,Polarizability
H,1.0,79.0,154.0,14.1,1.008,20.28,56.79964,6.51,31.0,32.0,...,1.0,1.0,1.122728,0.1805,110.0,120.0,162.0,288.6,1270.0,0.666793
He,2.0,147.832643,134.0,31.8,4.002602,4.216,85.10663,1.47,28.0,46.0,...,2.0,1.0,5.188,0.1513,140.0,143.0,153.0,236.2,970.0,0.205052
Li,3.0,155.0,220.0,13.1,6.94,1118.15,11.0,1410.0,128.0,133.0,...,1.0,2.0,3.489,85.0,182.0,212.0,255.0,245.1,6000.0,24.33
Be,4.0,112.0,219.0,5.0,9.012183,3243.0,130.0,214.0,96.0,102.0,...,2.0,2.0,1.824,190.0,153.0,198.0,223.0,274.5,13000.0,5.6
B,5.0,98.0,205.0,4.6,10.81,3931.0,320.0,99.2,84.0,85.0,...,2.0,2.0,1.025,27.0,192.0,191.0,215.0,408.3,16200.0,3.03
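
To make "imputed" concrete, the sketch below fills a missing value with the mean of the known ones, using the ``atomic_radius`` values of the first five elements shown in ``preset.elements``. Mean imputation is chosen here only because it is simple. This tutorial does not say which method produced ``elements_completed``, and the value it shows for He (147.832643) differs from the result below, so the actual method is evidently a different one.

```python
from statistics import mean

# atomic_radius for the first five elements, as shown in preset.elements;
# None marks the missing entry for He.
atomic_radius = {"H": 79.0, "He": None, "Li": 155.0, "Be": 112.0, "B": 98.0}

# Mean of the known values (an illustrative choice, not XenonPy's method).
known = [v for v in atomic_radius.values() if v is not None]
fill = mean(known)

# Replace only the missing entries and keep the observed values unchanged.
completed = {k: (fill if v is None else v) for k, v in atomic_radius.items()}
```

After this step every element has a value, which is the property that makes the completed table convenient as model input.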
