# Getting tables from the database

This short tutorial explains how to retrieve full tables from the database into [pandas DataFrames](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html).

## The following tables are available from ``mendeleev``

* elements
* ionicradii
* ionizationenergies
* oxidationstates
* groups
* series
* isotopes

``mendeleev`` provides a convenient function, `get_table`, for this task. The function can be imported directly from `mendeleev`:

In [1]:
from mendeleev import get_table

To retrieve a table, call ``get_table`` with the table name as the argument. Here we get what is probably the most important table, ``elements``, which holds basic data on each element:

In [2]:
ptable = get_table('elements')

Now we can use [pandas'](http://pandas.pydata.org) capabilities to work with the data. 

In [3]:
ptable.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 118 entries, 0 to 117
Data columns (total 53 columns):
annotation                   118 non-null object
atomic_number                118 non-null int64
atomic_radius                88 non-null float64
atomic_volume                91 non-null float64
block                        118 non-null object
boiling_point                96 non-null float64
density                      95 non-null float64
description                  109 non-null object
dipole_polarizability        106 non-null float64
electron_affinity            77 non-null float64
electronic_configuration     118 non-null object
evaporation_heat             88 non-null float64
fusion_heat                  75 non-null float64
group_id                     90 non-null float64
lattice_constant             87 non-null float64
lattice_structure            91 non-null object
melting_point                100 non-null float64
name                         118 non-null object
period       

For clarity, let's take only a subset of the columns:

In [4]:
cols = ['atomic_number', 'symbol', 'atomic_radius', 'en_pauling', 'block', 'vdw_radius_mm3']

In [5]:
ptable[cols].head()

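| | atomic_number | symbol | atomic_radius | en_pauling | block | vdw_radius_mm3 |
|---|---|---|---|---|---|---|
| 0 | 1 | H | 79.0 | 2.2 | s | 162.0 |
| 1 | 2 | He | | | s | 153.0 |
| 2 | 3 | Li | 155.0 | 0.98 | s | 255.0 |
| 3 | 4 | Be | 112.0 | 1.57 | s | 223.0 |
| 4 | 5 | B | 98.0 | 2.04 | p | 215.0 |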


It is quite easy now to get descriptive statistics on the data.

In [6]:
ptable[cols].describe()

| | atomic_number | atomic_radius | en_pauling | vdw_radius_mm3 |
|---|---|---|---|---|
| count | 118.0 | 88.0 | 85.0 | 94.0 |
| mean | 59.5 | 169.397727 | 1.748588 | 248.468085 |
| std | 34.207699 | 49.810108 | 0.634442 | 36.017828 |
| min | 1.0 | 79.0 | 0.7 | 153.0 |
| 25% | 30.25 | 137.0 | 1.24 | 229.0 |
| 50% | 59.5 | 160.0 | 1.7 | 244.0 |
| 75% | 88.75 | 181.0 | 2.16 | 269.25 |
| max | 118.0 | 299.0 | 3.98 | 364.0 |
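
The `count` values reported by ``describe`` are below 118 for several columns because missing values are skipped. The sketch below illustrates this on a small stand-in ``DataFrame`` built by hand from the five rows of the `head()` output above; the name `sample` is made up for this example and is not part of ``mendeleev``.

```python
import pandas as pd

# Stand-in data (assumption): the five rows shown by ptable[cols].head().
# None marks the values that are missing for helium.
sample = pd.DataFrame({
    'atomic_number': [1, 2, 3, 4, 5],
    'symbol': ['H', 'He', 'Li', 'Be', 'B'],
    'atomic_radius': [79.0, None, 155.0, 112.0, 98.0],
    'en_pauling': [2.2, None, 0.98, 1.57, 2.04],
    'block': ['s', 's', 's', 's', 'p'],
    'vdw_radius_mm3': [162.0, 153.0, 255.0, 223.0, 215.0],
})

# Number of missing values in each column
missing = sample.isnull().sum()
print(missing)

# By default describe() summarises the numeric columns only,
# and its 'count' row is the number of non-missing values.
numeric_summary = sample.describe()
print(numeric_summary.loc['count'])
```

The same two calls should work unchanged on `ptable[cols]`.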


## Isotopes table

Let's retrieve another table, namely ``isotopes``:

In [7]:
isotopes = get_table('isotopes', index_col='id')

In [8]:
isotopes.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 377 entries, 1 to 377
Data columns (total 8 columns):
atomic_number       377 non-null int64
mass                377 non-null float64
abundance           288 non-null float64
mass_number         377 non-null int64
mass_uncertainty    377 non-null float64
is_radioactive      377 non-null bool
half_life           121 non-null float64
half_life_unit      85 non-null object
dtypes: bool(1), float64(4), int64(2), object(1)
memory usage: 23.9+ KB


### Merge the elements table with the isotopes

We can now perform an SQL-like merge operation on the two ``DataFrame``s and produce an [outer](http://pandas.pydata.org/pandas-docs/stable/merging.html#database-style-dataframe-joining-merging) join:

In [9]:
import pandas as pd

In [10]:
merged = pd.merge(ptable[cols], isotopes, how='outer', on='atomic_number')

Now we have the following columns in the ``merged`` ``DataFrame``:

In [11]:
merged.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 377 entries, 0 to 376
Data columns (total 13 columns):
atomic_number       377 non-null int64
symbol              377 non-null object
atomic_radius       300 non-null float64
en_pauling          291 non-null float64
block               377 non-null object
vdw_radius_mm3      321 non-null float64
mass                377 non-null float64
abundance           288 non-null float64
mass_number         377 non-null int64
mass_uncertainty    377 non-null float64
is_radioactive      377 non-null bool
half_life           121 non-null float64
half_life_unit      85 non-null object
dtypes: bool(1), float64(7), int64(2), object(3)
memory usage: 38.7+ KB


In [12]:
merged.head()

| | atomic_number | symbol | atomic_radius | en_pauling | block | vdw_radius_mm3 | mass | abundance | mass_number | mass_uncertainty | is_radioactive | half_life | half_life_unit |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | H | 79.0 | 2.2 | s | 162.0 | 1.007825 | 0.99972 | 1 | 6e-10 | False | | |
| 1 | 1 | H | 79.0 | 2.2 | s | 162.0 | 2.014102 | 0.00028 | 2 | 8e-10 | False | | |
| 2 | 2 | He | | | s | 153.0 | 3.016029 | 2e-06 | 3 | 2e-08 | False | | |
| 3 | 2 | He | | | s | 153.0 | 4.002603 | 0.999998 | 4 | 4e-10 | False | | |
| 4 | 3 | Li | 155.0 | 0.98 | s | 255.0 | 6.015123 | 0.078 | 6 | 9e-09 | False | | |
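
To make the effect of `how='outer'` concrete, here is a minimal sketch on two small stand-in frames. The names `elements` and `isotopes_sample` are made up for this example, the masses are copied from the output above, and lithium's isotopes are deliberately left out of the stand-in data (they do exist in the real table) so that one element has no matching row.

```python
import pandas as pd

# Stand-in for a slice of the elements table (assumption, not real get_table output)
elements = pd.DataFrame({
    'atomic_number': [1, 2, 3],
    'symbol': ['H', 'He', 'Li'],
})

# Stand-in for a slice of the isotopes table; Li is omitted on purpose
isotopes_sample = pd.DataFrame({
    'atomic_number': [1, 1, 2, 2],
    'mass_number': [1, 2, 3, 4],
    'mass': [1.007825, 2.014102, 3.016029, 4.002603],
})

# An outer join keeps rows from both sides, filling gaps with NaN;
# indicator=True adds a '_merge' column saying where each row came from.
outer = pd.merge(elements, isotopes_sample, how='outer',
                 on='atomic_number', indicator=True)

# An inner join keeps only atomic numbers present in both frames.
inner = pd.merge(elements, isotopes_sample, how='inner', on='atomic_number')

print(len(outer), len(inner))
print(outer[outer['_merge'] == 'left_only'])
```

In this sketch the outer join has one more row than the inner join: the lithium row, whose isotope columns are empty.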


To display all the isotopes of silicon:

In [13]:
merged[merged['symbol'] == 'Si']

| | atomic_number | symbol | atomic_radius | en_pauling | block | vdw_radius_mm3 | mass | abundance | mass_number | mass_uncertainty | is_radioactive | half_life | half_life_unit |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 25 | 14 | Si | 132.0 | 1.9 | p | 229.0 | 27.976927 | 0.92191 | 28 | 3e-09 | False | | |
| 26 | 14 | Si | 132.0 | 1.9 | p | 229.0 | 28.976495 | 0.04699 | 29 | 3e-09 | False | | |
| 27 | 14 | Si | 132.0 | 1.9 | p | 229.0 | 29.97377 | 0.0311 | 30 | 2e-08 | False | | |
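
As one possible use of such a filtered selection, the sketch below computes the abundance-weighted mean mass of silicon. It runs on a hand-built stand-in frame (`si_sample`, a made-up name) containing only the three rows and three of the columns shown above, so it is an illustration rather than output from ``mendeleev``.

```python
import pandas as pd

# Stand-in data (assumption): the three silicon rows from the output above
si_sample = pd.DataFrame({
    'symbol': ['Si', 'Si', 'Si'],
    'mass': [27.976927, 28.976495, 29.97377],
    'abundance': [0.92191, 0.04699, 0.0311],
})

# Boolean mask, same pattern as merged[merged['symbol'] == 'Si']
si = si_sample[si_sample['symbol'] == 'Si']

# Abundance-weighted mean of the isotope masses
weighted_mass = (si['mass'] * si['abundance']).sum() / si['abundance'].sum()
print(round(weighted_mass, 3))
```

Dividing by the sum of the abundances keeps the result meaningful even when the listed abundances do not add up to exactly 1.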


In [14]:
%version_information mendeleev, numpy, scipy, pandas

| Software | Version |
|---|---|
| Python | 3.5.2 64bit [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] |
| IPython | 5.1.0 |
| OS | Linux 3.16.0 4 amd64 x86_64 with debian 8.6 |
| mendeleev | 0.2.17 |
| numpy | 1.11.2 |
| scipy | 0.18.1 |
| pandas | 0.19.2+0.g825876c.dirty |

Mon Jan 09 00:31:50 2017 CET
