# Cleaner

This notebook separates the Rundata inscription text from its codes and maps each code to related information from the other files that Rundata provides.

The first step is to load the data we want to merge into DataFrames. The deeper cleaning logic, which separates the codes from the runic text, lives in
`rundata_utils.py`.

In [8]:
from rundata_utils import get_table_from_text, get_dataframe_from_excel

df_en = get_table_from_text('ENGLISH')
df_runx = get_table_from_text('RUNTEXTX')

df_rd = get_dataframe_from_excel()
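
The helpers in `rundata_utils.py` are not shown in this notebook, so the following is only a rough sketch of the kind of split they might perform. It assumes, purely for illustration, that each line of a text file holds a code and its text separated by a tab; the function name `parse_lines`, the `sep` parameter, and the sample lines are all hypothetical, and the real file layout may differ.

```python
import pandas as pd


def parse_lines(lines, sep="\t"):
    """Split each non-empty line into a code and its text.

    Hypothetical helper: the tab separator is an assumption,
    not the documented Rundata file format.
    """
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # partition splits on the first separator only, so any later
        # separators stay inside the text
        signum, _, text = line.partition(sep)
        rows.append({"Signum": signum.strip(), "Text": text.strip()})
    return pd.DataFrame(rows, columns=["Signum", "Text"])


# Made-up input lines that reuse two entries from the output table
sample = [
    "Öl 5 †\talti auk keti... ... stein eftiR kata faþur sin\n",
    "\n",
    "By NOR1999;26\tarni\n",
]
df_sample = parse_lines(sample)
print(df_sample)
```

Returning a frame with `Signum` and `Text` columns matches what the merge step expects from `df_en` and `df_runx`.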


Merge the DataFrames on the shared `Signum` column and rename the columns to English.

In [12]:
import pandas as pd

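# Join the transliteration and the English translation on the shared Signum key
merged_df = pd.merge(df_runx, df_en, on='Signum', suffixes=('_runx', '_en'))
# Add location, style grouping, and dating from the Excel data
merged_df = pd.merge(merged_df, df_rd[['Signum', 'Plats', 'Stilgruppering', 'Period/Datering']], on='Signum')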

merged_df = merged_df.rename(columns={
    'Text_en': 'English', 
    'Text_runx': 'Transliteration', 
    'Signum': 'Code',
    'Plats': 'Location',
    'Stilgruppering': 'Style Grouping',
    'Period/Datering': 'Dating'
})
merged_df

Unnamed: 0,Code,Transliteration,English,Location,Style Grouping,Dating
0,Öl 1 $,§A s-a... --s- ias satr aiftir siba kuþa sun f...,§A This stone is set up in memory of Sibbi Góð...,Karlevi,RAK,V s 900-t
1,Öl 2 †$,tot-- þ-a- k--kaR ---- ...--- -tain iftiR sabi...,"Dóttir(?), Þegn(?), ...-geirr(?) [had the] sto...",Algutsrums kyrka,Pr3,V
2,Öl 3 †$,...iR bryþr litu r-isa ... ...ftiR ...-----s--...,These brothers had the [stones] raised in memo...,Resmo kyrka,Pr3 - Pr4?,V efter 1050
3,Öl 4 $,...-abi þaiR --tu raisa stein- eftiR rantui mo...,<...-abi> they had the stones raised in memory...,Resmo kyrka,Pr4,V efter 1050
4,Öl 5 †,alti auk keti... ... stein eftiR kata faþur sin,"Aldi and Ketill, (they had) the stone (raised)...",Bårby,Pr3,V
...,...,...,...,...,...,...
5162,UA Fv1914;47 $,krani kerþi half þisi iftir kal filaka sin,"Grani made this vault in memory of Karl/Káll, ...",Berezanj,,V
5163,By Fv1970;248,alftan ---t----l-a-----,Halfdan ...,Hagia Sofia,,V/M
5164,By NOR1999;26,arni,Árni,Hagia Sofia,,V?
5165,By NT1984;32 $,§A ... hiaku þir hilfniks min -----... ... en ...,"§A ... they cut(?), the troops men ... but in ...","Porto Leone, Pireus",,V
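
`pd.merge` performs an inner join by default, so any `Signum` present in only one of the inputs is dropped without a warning. One way to check for that is an outer join with `indicator=True`, sketched here on small made-up frames (the codes `A 1` to `A 4` and the names `left` and `right` are placeholders, not Rundata values).

```python
import pandas as pd

# Toy stand-ins for df_runx and df_en
left = pd.DataFrame({"Signum": ["A 1", "A 2", "A 3"], "Text": ["x", "y", "z"]})
right = pd.DataFrame({"Signum": ["A 1", "A 3", "A 4"], "Text": ["ex", "zed", "four"]})

# The outer join keeps every key; the _merge column records where each row came from
check = pd.merge(left, right, on="Signum", how="outer",
                 suffixes=("_runx", "_en"), indicator=True)
unmatched = check[check["_merge"] != "both"]
print(unmatched[["Signum", "_merge"]])

# validate raises MergeError if a Signum is duplicated on either side
inner = pd.merge(left, right, on="Signum",
                 suffixes=("_runx", "_en"), validate="one_to_one")
print(inner)
```

If `unmatched` is empty for the real frames, the inner join loses nothing. The `validate` argument is optional and only guards against duplicated keys, which would otherwise multiply rows.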


Write the merged data to a CSV file.

In [13]:
merged_df.to_csv('data/processed/merged.csv', index=False)
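
`to_csv` does not create missing directories, so the call above fails if `data/processed/` does not exist yet. The sketch below creates the folder first and reads the file back as a quick round-trip check. It writes to a temporary directory and uses a two-row stand-in frame, both chosen only for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Small stand-in for merged_df with non-ASCII codes
df = pd.DataFrame({
    "Code": ["Öl 1 $", "By NOR1999;26"],
    "Dating": ["V s 900-t", "V?"],
})

with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "data" / "processed" / "merged.csv"
    # Create the parent folders if they are missing
    out_path.parent.mkdir(parents=True, exist_ok=True)
    df.to_csv(out_path, index=False, encoding="utf-8")
    # Read the file back to confirm the text survives the round trip
    restored = pd.read_csv(out_path, encoding="utf-8")

print(restored)
```

With the default settings, empty cells such as the blank `Style Grouping` values come back from `read_csv` as `NaN` rather than empty strings.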