## Practice with pandas

[pandas](https://pandas.pydata.org/), the Python Data Analysis Library, is an important resource. pandas gives you many ways to practice your brand of data science, whatever your walk of life.

The `pandas.DataFrame` type lets you join `pandas.Series` columns into a multi-column data table, complete with row and column names of your choice, both re-orderable.

Once you have a DataFrame defined, adding new columns based on the old, getting summary statistics, applying functions, and generating visualizations are all within reach.
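
As a minimal sketch of those operations, here is a toy table. The `edge` and `area` columns and the `a`, `b`, `c` row labels are hypothetical and not part of the notebook that follows.

```python
import pandas as pd

# hypothetical toy data, for illustration only
df = pd.DataFrame({"edge": [1.0, 2.0, 3.0]}, index=["a", "b", "c"])

df["area"] = df["edge"] ** 2      # new column derived from an existing one
summary = df.describe()           # count, mean, std, min, quartiles, max per column
labels = df["edge"].apply(lambda e: f"{e:.1f} units")  # apply a function to each value

print(summary.loc["mean"])
print(labels.tolist())
```

Plotting is left out here because it needs a plotting backend such as matplotlib.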

In [1]:
# be skeptical of code like this -- but maybe?
name = 'math'
exec(f'import {name} as clc')

In [2]:
clc.sin(clc.radians(90)) # remembering trig

1.0
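
If the module name really does have to come from a string, the standard library's `importlib.import_module` is one way to do it without `exec`. This is a sketch of that alternative, not what the cells here rely on; the name `clc_alt` is made up for the example.

```python
import importlib

name = "math"
clc_alt = importlib.import_module(name)   # returns the module object, no exec needed

print(clc_alt.sin(clc_alt.radians(90)))   # same trig check as above
```

Unlike `exec`, this call only ever imports a module, so an unexpected string cannot run arbitrary statements.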

In [3]:
# and now for something completely different...
import numpy as np
import pandas as pd

Let's stack up a column of polyhedron names, using a kind of jargon or shorthand.

In [115]:
shapes = np.array(["Tetra", "Cubocta", "Icosa", "Cube", "Octa", 
                    "RT5", "RT5+", "RD", "RT", "Icosa", "Cubocta", 
                    "SuperRT", "Cube"], 
                    dtype=np.str_) 
shapes

array(['Tetra', 'Cubocta', 'Icosa', 'Cube', 'Octa', 'RT5', 'RT5+', 'RD',
       'RT', 'Icosa', 'Cubocta', 'SuperRT', 'Cube'], dtype='<U7')

So far that's a `numpy.ndarray` we've created. Now let's bring that into a Series.

In [116]:
shapes_col = pd.Series(shapes, name="Shape")
shapes_col

0       Tetra
1     Cubocta
2       Icosa
3        Cube
4        Octa
5         RT5
6        RT5+
7          RD
8          RT
9       Icosa
10    Cubocta
11    SuperRT
12       Cube
Name: Shape, dtype: object

The vertical Series, a column of some data type (dtype), is the building block of the DataFrame, which sets them side by side in a tabular arrangement.
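
One way to see the side-by-side arrangement is `pd.concat` with `axis=1`, which lines Series up as columns. This is a small sketch using three of the shapes and volumes from this notebook; the variable names `names`, `vols` and `table` are chosen just for the example.

```python
import pandas as pd

names = pd.Series(["Tetra", "Cube", "Octa"], name="Shape")
vols = pd.Series([1.0, 3.0, 4.0], name="IVM Volume")

# axis=1 places the Series next to each other; column names come from Series.name
table = pd.concat([names, vols], axis=1)
print(table)
```

Rows are matched on the index, so two Series with the default 0, 1, 2 index pair up row by row.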

In [117]:
# geometric constants
phi = (1 + clc.sqrt(5))/2

# volumes of specific tetrahedral wedges
Emod  = clc.sqrt(2)/8 * 1/phi**3
Emod3 = clc.sqrt(2)/8
Smod  = (phi**-5) / 2

Sfactor = Smod/Emod

S3 = clc.sqrt(9/8)

# defined to have edges = 2R or 1D
Icosa = 100 * Emod3 + 20 * Emod

# volumes corresponding to our shapes
volumes = np.array([1, 2.5, 2.5 * Sfactor**2, 3, 4, 5, 120 * Emod, 6, 7.5,
                    Icosa, 20, 20 * S3, 24], dtype=float)

In [118]:
volumes_col = pd.Series(volumes, name="IVM Volume")  # turn np.array into a pd.Series

In [119]:
volumes_col

0      1.000000
1      2.500000
2      2.917961
3      3.000000
4      4.000000
5      5.000000
6      5.007758
7      6.000000
8      7.500000
9     18.512296
10    20.000000
11    21.213203
12    24.000000
Name: IVM Volume, dtype: float64
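
The less obvious entries in that column can be re-derived with the standard library alone. The sketch below repeats the constant definitions from the geometric-constants cell, using `math` directly instead of the `clc` alias, and compares against the printed values within a small tolerance.

```python
import math

phi = (1 + math.sqrt(5)) / 2
Emod3 = math.sqrt(2) / 8
Emod = Emod3 / phi**3          # same as sqrt(2)/8 * 1/phi**3
Smod = (phi**-5) / 2
S3 = math.sqrt(9 / 8)

checks = {
    "RT5+":    (120 * Emod,               5.007758),
    "Icosa":   (100 * Emod3 + 20 * Emod,  18.512296),
    "SuperRT": (20 * S3,                  21.213203),
    "row 2":   (2.5 * (Smod / Emod) ** 2, 2.917961),
}
for label, (computed, printed) in checks.items():
    # printed values are rounded to six decimals, so allow a small tolerance
    print(label, math.isclose(computed, printed, abs_tol=5e-6))
```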

In [127]:
vols_table = pd.DataFrame({"Shape": shapes_col, "IVM Volume":volumes_col})
# vols_table.index = shapes_col  #  the shapes column is the index

In [128]:
vols_table

| | Shape | IVM Volume |
|---|---|---|
| 0 | Tetra | 1.0 |
| 1 | Cubocta | 2.5 |
| 2 | Icosa | 2.917961 |
| 3 | Cube | 3.0 |
| 4 | Octa | 4.0 |
| 5 | RT5 | 5.0 |
| 6 | RT5+ | 5.007758 |
| 7 | RD | 6.0 |
| 8 | RT | 7.5 |
| 9 | Icosa | 18.512296 |


In [129]:
vols_table['XYZ Volume'] = vols_table['IVM Volume'] * 1/S3

In [130]:
vols_table

| | Shape | IVM Volume | XYZ Volume |
|---|---|---|---|
| 0 | Tetra | 1.0 | 0.942809 |
| 1 | Cubocta | 2.5 | 2.357023 |
| 2 | Icosa | 2.917961 | 2.75108 |
| 3 | Cube | 3.0 | 2.828427 |
| 4 | Octa | 4.0 | 3.771236 |
| 5 | RT5 | 5.0 | 4.714045 |
| 6 | RT5+ | 5.007758 | 4.72136 |
| 7 | RD | 6.0 | 5.656854 |
| 8 | RT | 7.5 | 7.071068 |
| 9 | Icosa | 18.512296 | 17.45356 |


Practice with `df.loc[rows, columns]` (label-based) and its positional counterpart `df.iloc[rows, columns]`.

In [131]:
vols_table.iloc[12]  # entire row

Shape              Cube
IVM Volume         24.0
XYZ Volume    22.627417
Name: 12, dtype: object

In [132]:
vols_table.iloc[0]  # entire row

Shape            Tetra
IVM Volume         1.0
XYZ Volume    0.942809
Name: 0, dtype: object
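
With the default integer index, `loc` and `iloc` look alike, so the difference is easier to see once the rows carry labels. This is a sketch with a three-row stand-in table; `demo` is a hypothetical name, not a variable from this notebook.

```python
import pandas as pd

demo = pd.DataFrame(
    {"IVM Volume": [1.0, 3.0, 4.0]},
    index=["Tetra", "Cube", "Octa"],   # row labels instead of the default 0, 1, 2
)

by_label = demo.loc["Cube", "IVM Volume"]   # loc: select by row and column labels
by_position = demo.iloc[1, 0]               # iloc: select by integer positions

print(by_label, by_position)
```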

`df.loc[df['col1'] == value]`

In [137]:
vols_table.loc[vols_table['Shape'] == "SuperRT"]  # rows where Shape equals "SuperRT"

| | Shape | IVM Volume | XYZ Volume |
|---|---|---|---|
| 11 | SuperRT | 21.213203 | 20.0 |
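
The boolean mask returns whole rows as a DataFrame. To pull out a single cell, one option is to pass a column label as well and then take the first match. This is a sketch on a small stand-in table; `demo`, `mask`, `rows` and `cell` are names made up for the example.

```python
import pandas as pd

demo = pd.DataFrame({"Shape": ["RT", "SuperRT", "Cube"],
                     "IVM Volume": [7.5, 21.213203, 24.0]})

mask = demo["Shape"] == "SuperRT"              # boolean Series, one entry per row
rows = demo.loc[mask]                          # DataFrame holding the matching rows
cell = demo.loc[mask, "IVM Volume"].iloc[0]    # scalar from the first matching row

print(mask.tolist())
print(cell)
```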


Now let's add some constituent modules that may be used to assemble the above shapes.

In [141]:
modules = np.array(["A","B", "T", "E", "S"],
                    dtype=np.str_) 

mods_col = pd.Series(modules, name="Shape")

mod_vols = np.array([1/24, 1/24, 1/24, Emod, Smod], dtype=float)
mod_vols_col = pd.Series(mod_vols, name="IVM Volume")

mods_table = pd.DataFrame({"Shape":mods_col, "IVM Volume":mod_vols_col})
# mods_table.index = mods_col

In [142]:
mods_table

| | Shape | IVM Volume |
|---|---|---|
| 0 | A | 0.041667 |
| 1 | B | 0.041667 |
| 2 | T | 0.041667 |
| 3 | E | 0.041731 |
| 4 | S | 0.045085 |


In [143]:
mods_table['XYZ Volume'] = mods_table['IVM Volume'] * 1/S3

In [144]:
mods_table

| | Shape | IVM Volume | XYZ Volume |
|---|---|---|---|
| 0 | A | 0.041667 | 0.039284 |
| 1 | B | 0.041667 | 0.039284 |
| 2 | T | 0.041667 | 0.039284 |
| 3 | E | 0.041731 | 0.039345 |
| 4 | S | 0.045085 | 0.042507 |


And now it's time to assemble the full table.

In [145]:
pd.concat([mods_table, vols_table])

| | Shape | IVM Volume | XYZ Volume |
|---|---|---|---|
| 0 | A | 0.041667 | 0.039284 |
| 1 | B | 0.041667 | 0.039284 |
| 2 | T | 0.041667 | 0.039284 |
| 3 | E | 0.041731 | 0.039345 |
| 4 | S | 0.045085 | 0.042507 |
| 0 | Tetra | 1.0 | 0.942809 |
| 1 | Cubocta | 2.5 | 2.357023 |
| 2 | Icosa | 2.917961 | 2.75108 |
| 3 | Cube | 3.0 | 2.828427 |
| 4 | Octa | 4.0 | 3.771236 |


In [163]:
CH = pd.concat([mods_table, vols_table])

In [164]:
CH = CH.reset_index(drop=True)
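
The concatenation and the index reset can also be done in one call, since `pd.concat` accepts `ignore_index=True`. This is a sketch on two small stand-in tables; `top` and `bottom` are hypothetical names.

```python
import pandas as pd

top = pd.DataFrame({"Shape": ["A", "B"], "IVM Volume": [1/24, 1/24]})
bottom = pd.DataFrame({"Shape": ["Tetra", "Cube"], "IVM Volume": [1.0, 3.0]})

stacked = pd.concat([top, bottom])                        # keeps each table's own index
renumbered = pd.concat([top, bottom], ignore_index=True)  # fresh index for the result

print(stacked.index.tolist())
print(renumbered.index.tolist())
```

The two-step version used here has the advantage of showing the duplicated index first, which makes the reason for the reset visible.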

In [165]:
CH.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 18 entries, 0 to 17
Data columns (total 3 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   Shape       18 non-null     object 
 1   IVM Volume  18 non-null     float64
 2   XYZ Volume  18 non-null     float64
dtypes: float64(2), object(1)
memory usage: 560.0+ bytes


In [166]:
# df['new'] = pd.Series(dtype='int')
CH['Comments'] = pd.Series(dtype='object')  # empty column, ready to hold strings

In [167]:
CH

| | Shape | IVM Volume | XYZ Volume | Comments |
|---|---|---|---|---|
| 0 | A | 0.041667 | 0.039284 | |
| 1 | B | 0.041667 | 0.039284 | |
| 2 | T | 0.041667 | 0.039284 | |
| 3 | E | 0.041731 | 0.039345 | |
| 4 | S | 0.045085 | 0.042507 | |
| 5 | Tetra | 1.0 | 0.942809 | |
| 6 | Cubocta | 2.5 | 2.357023 | |
| 7 | Icosa | 2.917961 | 2.75108 | |
| 8 | Cube | 3.0 | 2.828427 | |
| 9 | Octa | 4.0 | 3.771236 | |


In [170]:
CH.iloc[0, -1] = '24 make a Tetra'
CH.iloc[1, -1] = 'AAB = BAA = Mite'
CH.iloc[2, -1] = '1/120 RT5'
CH.iloc[3, -1] = '1/120 RT5+'
CH.iloc[4, -1] = '(φ**-5) / 2'
CH.iloc[5, -1] = "edges D, from 4 IVM balls"
CH.iloc[6, -1] = 'some faces flush with Octa 4'
CH.iloc[7, -1] = 'some faces flush with Octa 4'
CH.iloc[8, -1] = 'Duo-Tet, face diagonals = D'
CH.iloc[9, -1] = 'Dual of Cube, edges D'
CH.iloc[10, -1] = '120 T mods'
CH.iloc[11, -1] = '120 E mods'
CH.iloc[12, -1] = 'long diagonals = D, sphere domain'
CH.iloc[13, -1] = 'some vertexes shared with RD'
CH.iloc[14, -1] = 'edges = D'
CH.iloc[15, -1] = 'edges = D, 1F, 12-balls around nuclear ball'
CH.iloc[16, -1] = 'icosa of edges D + dual'
CH.iloc[17, -1] = 'face diagonals = 2D, 2F'
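
When there is one comment per row, assigning a whole list to the column is a shorter alternative to one `iloc` statement per cell. This is a sketch on a three-row stand-in table with the first three comments from above; `demo` and `comments` are names made up for the example.

```python
import pandas as pd

demo = pd.DataFrame({"Shape": ["A", "B", "T"]})

# one entry per row, in row order; the list length must equal len(demo)
comments = ["24 make a Tetra", "AAB = BAA = Mite", "1/120 RT5"]

demo["Comments"] = comments   # creates the column or overwrites it in one step
print(demo)
```

Cell-by-cell assignment remains handy when only a few rows need a note.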

In [171]:
CH

| | Shape | IVM Volume | XYZ Volume | Comments |
|---|---|---|---|---|
| 0 | A | 0.041667 | 0.039284 | 24 make a Tetra |
| 1 | B | 0.041667 | 0.039284 | AAB = BAA = Mite |
| 2 | T | 0.041667 | 0.039284 | 1/120 RT5 |
| 3 | E | 0.041731 | 0.039345 | 1/120 RT5+ |
| 4 | S | 0.045085 | 0.042507 | (φ**-5) / 2 |
| 5 | Tetra | 1.0 | 0.942809 | edges D, from 4 IVM balls |
| 6 | Cubocta | 2.5 | 2.357023 | some faces flush with Octa 4 |
| 7 | Icosa | 2.917961 | 2.75108 | some faces flush with Octa 4 |
| 8 | Cube | 3.0 | 2.828427 | Duo-Tet, face diagonals = D |
| 9 | Octa | 4.0 | 3.771236 | Dual of Cube, edges D |
