# Comparison to SNoW

> Given a set of web tables and a target knowledge base, the SNoW method extends each web table with additional context columns, stitches matching web tables into larger tables, and applies functional dependency discovery to identify the relations that are represented in the web tables. Further, it normalises the stitched tables, guided by the schema of the knowledge base, to create an integrated schema.

We assume that the tables have already been enriched with context columns and unioned by schema within each pay-level domain (PLD). The remaining task is to match and stitch these supertables into universal tables, and then to decompose those into normalised relations.

## Matching without FDs

In [6]:
%%time
from snow_pipeline import *
log.getLogger().setLevel(log.DEBUG)

snow_root = Path('~/snow/').expanduser().absolute()
kb = KB(snow_root)

Loading KB classes: 100%|██████████| 20/20 [00:03<00:00,  5.35it/s]
DEBUG:root:Made KB feature matrix of shape (20, 2336698)


CPU times: user 35.4 s, sys: 1.34 s, total: 36.7 s
Wall time: 37.7 s


In [2]:
%%time
from snow_pipeline import *
log.getLogger().setLevel(log.DEBUG)

snow_root = Path('~/snow/').expanduser().absolute()
benchmark_datasets = dict(get_snow_datasets(snow_root))
for name, ds in benchmark_datasets.items():
    print(f'{len(ds[0].fnames):3d}', name)

dataset_name = 'itunes.apple.com'
ts = list(takco.TableSet.dataset(benchmark_datasets[dataset_name][0]))#[-30:]
print(f"Loaded {dataset_name},  {len(ts)} tables")

 12 d3football.com
  8 www.vgchartz.com
213 www.cia.gov
 29 www.nndb.com
  6 flightaware.com
 76 itunes.apple.com
 74 seatgeek.com
 65 www.amoeba.com
 13 data.bls.gov
Loaded itunes.apple.com,  76 tables
CPU times: user 12.2 s, sys: 4.58 s, total: 16.8 s
Wall time: 12.7 s


In [24]:
%%time
tabid_df = preprocess_tables(ts)

CPU times: user 3.5 s, sys: 10.9 ms, total: 3.51 s
Wall time: 3.53 s


In [25]:
# Which features of these tables can we use to match them?
# predict_fkclasses(tabid_df, dataset_name, kb)
# for tabid, df in tabid_df.items():
#     for c, s in df.items():
#         top = pd.Series(TfidfMatcher._analyzer(s.unique())).value_counts()[:3]
#         print(tabid, c, dict(top))

In [26]:
from snow_pipeline import *

matchers = [
    ExactHeadMatcher(include_context=True),
    TfidfMatcher(num_threshold=1, min_df=2),
    KBClassMatcher(kb),
]
agg_func = "KBClassMatcher * @max(ExactHeadMatcher, TfidfMatcher)"
agg_threshold_col = 0

partcols, idpairs = match_columns(tabid_df, matchers, agg_func=agg_func, agg_threshold_col=agg_threshold_col)
partcolid_to_colids = aggr_by_val(partcols.items())
partcolid_to_colids

Getting column text: 100%|██████████| 20/20 [00:02<00:00,  7.29it/s]
Extracting features: 100%|██████████| 224/224 [00:01<00:00, 112.83it/s]
DEBUG:root:Got (224, 5569) column features. Calculating similarities...
DEBUG:root:[KBClassMatcher] [6.json] Class predictions: ['uri 0:Person/6.72e-06', 'name:Single/8.36e-01', 'interpret:Band/4.91e-02', 'preis:Album/6.91e-06', 'NULL:Album/9.91e-07', 'uri 1 (album):Single/4.72e-02']
DEBUG:root:[KBClassMatcher] [60.json] Class predictions: ['uri 0:Single/1.93e-03', 'nome:Single/4.12e-01', 'album:Album/3.63e-01', 'artista:Athlete/5.39e-02', 'prezzo:Album/4.43e-04', 'NULL:Album/8.52e-04', 'uri 1 (album):Band/3.12e-02']
DEBUG:root:[KBClassMatcher] [61.json] Class predictions: ['NULL:Album/1.25e-03']
DEBUG:root:[KBClassMatcher] [62.json] Class predictions: ['uri 0:Album/7.58e-04', 'name:Single/8.59e-02', 'preis:Band/1.09e-05', 'NULL:Album/1.12e-03', 'uri 1 (podcast):Album/1.80e-03']
DEBUG:root:[KBClassMatcher] [63.json] Class predictions: ['uri 0:Comp

{11: {"6.json~Col0 ('page title',)",
  "60.json~Col0 ('page title',)",
  "62.json~Col0 ('page title',)",
  "63.json~Col0 ('page title',)",
  "65.json~Col0 ('page title',)",
  "67.json~Col0 ('page title',)",
  "69.json~Col0 ('page title',)",
  "7.json~Col0 ('page title',)",
  "71.json~Col0 ('page title',)",
  "72.json~Col0 ('page title',)",
  "8.json~Col0 ('page title',)",
  "9.json~Col0 ('page title',)"},
 10: {"6.json~Col1 ('table heading',)",
  "60.json~Col1 ('table heading',)",
  "62.json~Col1 ('table heading',)",
  "63.json~Col1 ('table heading',)",
  "65.json~Col1 ('table heading',)",
  "67.json~Col1 ('table heading',)",
  "69.json~Col1 ('table heading',)",
  "7.json~Col1 ('table heading',)",
  "71.json~Col1 ('table heading',)",
  "72.json~Col1 ('table heading',)",
  "8.json~Col1 ('table heading',)",
  "9.json~Col1 ('table heading',)"},
 20: {"6.json~Col10 ('uri 1 (album)',)",
  "60.json~Col11 ('uri 1 (album)',)",
  "63.json~Col11 ('uri 1 (album)',)",
  "65.json~Col11 ('uri 1 (alb
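
The `agg_func` expression above combines the per-matcher similarity scores of each column pair. A minimal sketch of one plausible reading follows: multiply the KB class score by the larger of the header and TF-IDF scores, and keep pairs whose combined score exceeds the threshold. The function `aggregate_scores` and the score values are hypothetical; how `match_columns` actually evaluates the expression and applies `agg_threshold_col` is not shown in this notebook.

```python
def aggregate_scores(scores, threshold=0.0):
    """Combine matcher scores as kb * max(head, tfidf); keep pairs above threshold."""
    kept = {}
    for pair, s in scores.items():
        combined = s["KBClassMatcher"] * max(s["ExactHeadMatcher"], s["TfidfMatcher"])
        if combined > threshold:
            kept[pair] = combined
    return kept


# Illustrative scores for three column pairs (not taken from the real run).
scores = {
    ("t1~Col0", "t2~Col0"): {"ExactHeadMatcher": 1.0, "TfidfMatcher": 0.5, "KBClassMatcher": 0.5},
    ("t1~Col1", "t2~Col3"): {"ExactHeadMatcher": 0.0, "TfidfMatcher": 0.25, "KBClassMatcher": 0.5},
    ("t1~Col2", "t2~Col1"): {"ExactHeadMatcher": 1.0, "TfidfMatcher": 1.0, "KBClassMatcher": 0.0},
}
matches = aggregate_scores(scores)
for pair, score in matches.items():
    print(pair, score)
```

Under this reading, the product acts as a gate: a pair with identical headers is still rejected when the KB class score is zero, as with the third pair.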

In [27]:
partid_to_tabids = partition_connected_components(tabid_df, partcolid_to_colids, idpairs)
for partid, tis in partid_to_tabids.items():
    print(f"part-{partid}", len(tis), tis)

part-0 12 {'62.json', '63.json', '67.json', '6.json', '65.json', '71.json', '60.json', '72.json', '9.json', '8.json', '69.json', '7.json'}
part-1 5 {'64.json', '74.json', '61.json', '73.json', '75.json'}
part-2 1 {'66.json'}
part-3 2 {'70.json', '68.json'}
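
Grouping tables into partitions amounts to finding connected components in a graph whose nodes are tables and whose edges are matched table pairs. The sketch below does this with a small union-find. The function `connected_components` and its inputs are hypothetical stand-ins, not the implementation of `partition_connected_components`.

```python
def connected_components(nodes, edges):
    """Group nodes into components given undirected edges (union-find)."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    components = {}
    for n in nodes:
        components.setdefault(find(n), set()).add(n)
    # Largest component first; ties broken by smallest member for determinism.
    return sorted(components.values(), key=lambda c: (-len(c), min(c)))


# Illustrative table ids and matched pairs (not taken from the real run).
tables = ["a.json", "b.json", "c.json", "d.json", "e.json"]
matched_pairs = [("a.json", "b.json"), ("b.json", "c.json"), ("d.json", "e.json")]
parts = connected_components(tables, matched_pairs)
for partid, members in enumerate(parts):
    print(f"part-{partid}", len(members), sorted(members))
```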


In [28]:
from snow_pipeline import *

stitched = stitch_colclustered_tables(tabid_df, partcols, idpairs)
stitched_df = {}
for partid, (df, cols) in enumerate(stitched):
    df.columns = pd.MultiIndex.from_tuples(cols)
    stitched_df[f"part-{partid}"] = df
    print(*zip(*cols))
    display( df.sample( min(5, len(df)) ) )
#     if len(df.columns) > 10:
#         break

DEBUG:root:Stitching 12 aligned tables


('NULL', 'NULL', 'preis', 'uri 0', 'table heading', 'page title', 'duração', 'uri 3', 'uri 1 (album)', 'nome', 'artista', 'name', 'interpret', 'álbum', 'album', 'artiest', 'uri 1 (podcast)', 'beschreibung', 'erschienen', 'disambiguation of album', 'disambiguation of page title')


Unnamed: 0,NULL,NULL.1,preis,uri 0,table heading,page title,duração,uri 3,uri 1 (album),nome,...,name,interpret,álbum,album,artiest,uri 1 (podcast),beschreibung,erschienen,disambiguation of album,disambiguation of page title
2057,1,ver no itunes,"0,99 €",pt,,itunes - música - 2 future 4 u de armand van h...,8:04,id73277577,2 future 4 u,u don't know me (featuring duane harden),...,,,2 future 4 u,,,,,,,
830,1,ver en itunes,"0,99 €",es,,itunes - música - talk about love de the emeralds,3:27,id60596385,a girl who loves kurt cobain,talk about love,...,,,,,,,,,,
3679,12,ver en itunes,"1,29 €",es,,itunes - música - ladies and gentlemen de lou ...,3:17,id306966541,god is a woman,angelina,...,,,,,,,,,,
4880,in itunes ansehen,11,"1,29 €",de,,itunes - musik – „platinum & gold collection: ...,2:43,id291292244,platinum gold collection petula,,...,i know a place,petula clark,,,,,,,,
1221,5,ver no itunes,usd 0.99,br,,itunes - música - fallbrooke de fallbrooke,3:31,id327959592,condition response,losin' it,...,,,,,,,,,,


DEBUG:root:Stitching 5 aligned tables


('price', 'uri 3', 'table heading', 'page title', 'NULL', 'NULL', 'time', 'album', 'artist', 'name', 'uri 0', 'uri 1 (music video)', 'naam', 'uri 1 (mix)', 'uri 1 (album)', 'description', 'uri 1 (podcast)')


Unnamed: 0,price,uri 3,table heading,page title,NULL,NULL.1,time,album,artist,name,uri 0,uri 1 (music video),naam,uri 1 (mix),uri 1 (album),description,uri 1 (podcast)
10,$1.29,id5823141,,itunes - music - berlioz: overtures by san die...,view in itunes,1,7:21,the incredibles (music from the motion picture),michael giacchino,the incredits,,,,,,,
12,"€ 0,99",id337791439,,itunes - muziekvideo's - 'mess of me' van swit...,bekijk in itunes,3,3:51,the best yet,,,nl,mess of me,this is home,,,,
3,$0.99,id3327656,,"itunes - music - scriabin: symphony no. 1, rêv...",view in itunes,2,9:21,holst: the planets,london symphony orchestra,"the planets, op. 32: v. saturn, the bringer of...",,,,,,,
13,"€ 0,99",id337791439,,itunes - muziekvideo's - 'mess of me' van swit...,bekijk in itunes,4,4:13,the best yet,,,nl,mess of me,only hope,,,,
10,"€ 0,69",id337791439,,itunes - muziekvideo's - 'mess of me' van swit...,bekijk in itunes,1,2:30,oh! gravity.,,,nl,mess of me,oh! gravity.,,,,


DEBUG:root:Stitching 1 aligned tables


('page title', 'table heading', 'uri 0', 'uri 3', 'NULL', 'name', 'description', 'released', 'price', 'NULL', 'uri 1 (itunes u)')


Unnamed: 0,page title,table heading,uri 0,uri 3,NULL,name,description,released,price,NULL.1,uri 1 (itunes u)
2,duncan phillips lectures - download free conte...,,us,id567592969,3,peter doig,"mar 17, 2011",3/17/2011,free,view in itunes,duncan phillips lectures
10,a seminar on expository preaching - download f...,,us,id378880148,1,personal inadequacy: the story of jehoshaphat,--,4/20/2010,free,view in itunes,seminar on expository preaching
0,duncan phillips lectures - download free conte...,,us,id567592969,1,yve-alain bois,"sep 19, 2013",9/19/2013,free,view in itunes,duncan phillips lectures
7,duncan phillips lectures - download free conte...,,us,id567592969,8,robert storr,"dec 6, 2007",12/6/2007,free,view in itunes,duncan phillips lectures
5,duncan phillips lectures - download free conte...,,us,id567592969,6,eric fischl,"jul 16, 2009",7/16/2009,free,view in itunes,duncan phillips lectures


DEBUG:root:Stitching 2 aligned tables


('page title', 'table heading', 'uri 0', 'NULL', 'タイトル', 'アルバム', '時間', '価格', 'NULL', 'uri 1 (album)', 'uri 3', 'アーティスト', 'disambiguation of page title')


Unnamed: 0,page title,table heading,uri 0,NULL,タイトル,アルバム,時間,価格,NULL.1,uri 1 (album),uri 3,アーティスト,disambiguation of page title
7,itunes - musique - the fire in our throats wil...,,fr,afficher sur itunes,far from fields,city of echoes,5:17,"0,99 €",8,id458377565,,,
4,itunes - musique - the fire in our throats wil...,,fr,afficher sur itunes,city of echoes,city of echoes,7:05,"0,99 €",5,id458377565,,,
7,itunes - ミュージック - the clancy brothers & tommy ...,,jp,8,"puff, the magic dragon","the very best of peter, paul and mary",3:28,¥200,itunes で見る,irish gold re mastered,id296460263,"peter, paul & mary",re-mastered
4,itunes - ミュージック - the clancy brothers & tommy ...,,jp,5,leaving on a jet plane,"the very best of peter, paul and mary",3:27,¥150,itunes で見る,irish gold re mastered,id296460263,"peter, paul & mary",re-mastered
0,itunes - ミュージック - the clancy brothers & tommy ...,,jp,1,sundown,gord's gold,3:33,¥200,itunes で見る,irish gold re mastered,id296460263,ゴードン・ライトフット,re-mastered
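
Stitching can be thought of as a union of rows in which columns are aligned by their column cluster rather than by their header. The sketch below illustrates that idea on plain lists of dictionaries. The function `stitch`, the table ids, and the cluster assignment are assumptions made for illustration and do not reproduce `stitch_colclustered_tables`.

```python
def stitch(tables, col_to_cluster):
    """Union rows of several tables, aligning columns by cluster id."""
    # Collect cluster ids in first-seen order to get a stable column order.
    clusters = []
    for tabid, rows in tables.items():
        for row in rows:
            for col in row:
                cid = col_to_cluster[(tabid, col)]
                if cid not in clusters:
                    clusters.append(cid)

    stitched = []
    for tabid, rows in tables.items():
        for row in rows:
            aligned = dict.fromkeys(clusters)  # cells without a value stay None
            for col, val in row.items():
                aligned[col_to_cluster[(tabid, col)]] = val
            stitched.append(aligned)
    return clusters, stitched


# Two illustrative tables: the price columns share a cluster, the title columns do not.
tables = {
    "de": [{"name": "i know a place", "preis": "1,29 €"}],
    "pt": [{"nome": "angelina", "preço": "0,99 €"}],
}
col_to_cluster = {
    ("de", "name"): "name",
    ("de", "preis"): "preis",
    ("pt", "nome"): "nome",
    ("pt", "preço"): "preis",
}
columns, rows = stitch(tables, col_to_cluster)
print(columns)
for row in rows:
    print(row)
```

Columns that were not clustered together remain separate and sparsely filled, which is the pattern visible in the samples above, where `nome` and `name` are never both populated in the same row.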


In [29]:
df = stitched_df['part-0']
print(len(df))
sim = kb._get_sim(df)
display(df.describe().T)
print(list(df.columns[list(sim.columns)]))
sim.style.background_gradient()

22645


Unnamed: 0,count,unique,top,freq
,22645,62,in itunes ansehen,7284
,22645,133,ver no itunes,10519
preis,22645,25,"0,99 €",6884
uri 0,22645,12,br,8559
table heading,22645,1,,22645
page title,22645,1859,itunes - musik – „the very best of“ von julie ...,126
duração,22626,689,3:48,170
uri 3,22625,1840,id335530222,126
uri 1 (album),22621,1864,i love you,126
nome,15346,12289,intro,14


[('NULL',), ('uri 0',), ('table heading',), ('uri 1 (album)',), ('nome',)]


Unnamed: 0,1,3,4,8,9
Hospital,0.0,0.0,0.0,0.001833,0.018278
EducationalInstitution,1e-06,1e-06,0.0,0.005834,0.041967
Album,2.7e-05,2e-06,0.0,0.050681,0.359373
AdministrativeRegion,0.0,0.0,0.0,0.00528,0.034159
Band,8e-06,1e-06,0.0,0.03783,0.270095
Airline,0.0,0.0,0.0,0.000469,0.003688
Film,1.1e-05,1e-06,0.0,0.043142,0.316596
TelevisionShow,1.1e-05,1e-06,0.0,0.042857,0.323054
Single,2.9e-05,1e-06,0.0,0.053789,0.472119
Building,7e-06,1e-06,0.0,0.009468,0.078267


In [30]:
tabid_to_colnr_and_fkclass = predict_fkclasses(stitched_df, dataset_name, kb)
tabid_to_colnr_and_fkclass

DEBUG:root:[itunes.apple.com] [part-0] Class predictions: ['NULL:Single/2.86e-05', 'uri 0:Person/3.43e-06', 'uri 1 (album):Single/5.38e-02', 'nome:Single/4.72e-01']
DEBUG:root:[itunes.apple.com] [part-1] Class predictions: ['NULL:Album/2.77e-03', 'artist:Band/3.35e-02', 'uri 0:Company/8.10e-05']
DEBUG:root:[itunes.apple.com] [part-2] Class predictions: ['uri 0:Single/2.08e-04', 'price:Country/3.14e-04', 'NULL:Album/2.00e-03', 'uri 1 (itunes u):Artist/4.63e-03']
DEBUG:root:[itunes.apple.com] [part-3] Class predictions: ['uri 0:Airline/5.52e-05', 'NULL:Settlement/4.14e-03', 'タイトル:Film/6.85e-01', 'アルバム:Country/1.37e-01', 'uri 1 (album):Single/5.97e-04']


{'part-0': (9, 'Single'),
 'part-1': (8, 'Band'),
 'part-2': (10, 'Artist'),
 'part-3': (4, 'Film')}
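
The result above is consistent with choosing, per stitched table, the column whose best class score is highest. The sketch below applies that rule to the scores logged for `part-0`, with column numbers taken from the similarity matrix shown earlier. The function `pick_key_column` is hypothetical, and `predict_fkclasses` may apply additional rules that are not visible here.

```python
def pick_key_column(best_class_per_column):
    """Return (column number, class) for the column with the highest class score."""
    colnr, (cls, _score) = max(
        best_class_per_column.items(), key=lambda item: item[1][1]
    )
    return colnr, cls


# Best class and score per column of part-0, as logged above.
part0 = {
    1: ("Single", 2.86e-05),  # NULL
    3: ("Person", 3.43e-06),  # uri 0
    8: ("Single", 5.38e-02),  # uri 1 (album)
    9: ("Single", 4.72e-01),  # nome
}
print(pick_key_column(part0))
```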

In [31]:
nary_induction = True
decomposed = iter_decomposed(
    stitched_df, 
    dataset_name, 
    tabid_to_colnr_and_fkclass, 
    nary=nary_induction, 
    nary_stoplevel=2, 
    nary_minp=0.95
)
for t in postprocess_tables(decomposed, numeric_threshold=0.5):
    print(f"{t._id}: {len(t.df)} rows")
    display( t.df.sample(min(len(t.df), 3)) )

DEBUG:root:[itunes.apple.com] [part-0] Decomposing class Single for col 9 (('nome',))
DEBUG:root:[itunes.apple.com] [part-0] Not decomposing context columns [('uri 0',), ('table heading',), ('page title',), ('uri 3',), ('uri 1 (album)',), ('uri 1 (podcast)',), ('disambiguation of album',), ('disambiguation of page title',)]
DEBUG:root:[itunes.apple.com] [part-0] Inferring FDs for [NULL|uri 0|uri 1 (album)|FK]
FD candidates: 100%|██████████| 7/7 [00:04<00:00,  1.59it/s]
DEBUG:root:[itunes.apple.com] [part-0] Got FD key [uri 1 (album)|NULL|FK] -> [NULL|preis|duração|artista|name|interpret|álbum|album|artiest|beschreibung|erschienen]


Single_itunes.apple.com_fd_0.json: 22377 rows


Unnamed: 0,NULL,uri 1 (album),NULL.1,FK
3379,5.0,charm,ver no itunes,Single_itunes.apple.com~Row11305
4689,9.0,dont you want me king britt,ver no itunes,Single_itunes.apple.com~Row3112
7683,28.0,hush,ver en itunes,Single_itunes.apple.com~Row5098


Single_itunes.apple.com_fd_1.json: 22177 rows


Unnamed: 0,preis,uri 1 (album),NULL,FK
5994,99.0,free spirits,ver no itunes,Single_itunes.apple.com~Row10556
10006,0.99,knock 3 times,ver no itunes,Single_itunes.apple.com~Row9793
11092,1.9,look into the future,2,Single_itunes.apple.com~Row0


Single_itunes.apple.com_fd_2.json: 22458 rows


Unnamed: 0,duração,uri 1 (album),NULL,FK
7519,509.0,hook up,ver no itunes,Single_itunes.apple.com~Row1164
12777,706.0,my little suede shoes,ver no itunes,Single_itunes.apple.com~Row330
2937,240.0,buying time,ver en itunes,Single_itunes.apple.com~Row1931


Single_itunes.apple.com_fd_3.json: 9480 rows


Unnamed: 0,artista,uri 1 (album),NULL,FK
7100,kevin spacey,some of these days,ver no itunes,Single_itunes.apple.com~Row3097
2015,james brown & his famous flames,doing it to death,ver en itunes,Single_itunes.apple.com~Row6305
4173,enter shikari,jonny sniper acid nation ep,ver en itunes,Single_itunes.apple.com~Row8187


Single_itunes.apple.com_fd_4.json: 7279 rows


Unnamed: 0,name,uri 1 (album),NULL,FK
5431,tailgate ramble,sounds of new orleans vol. 6,7.0,Single_itunes.apple.com~Row0
5452,lovable,specialty profiles sam cooke,7.0,Single_itunes.apple.com~Row0
5521,make it happen,step into the sunshine,6.0,Single_itunes.apple.com~Row0


Single_itunes.apple.com_fd_5.json: 7161 rows


Unnamed: 0,interpret,uri 1 (album),NULL,FK
1420,don cherry,dont let stars get in your,3.0,Single_itunes.apple.com~Row0
1829,glee cast,glee music presents warblers,11.0,Single_itunes.apple.com~Row0
1627,the antlers,familiars,1.0,Single_itunes.apple.com~Row0


Single_itunes.apple.com_fd_6.json: 5773 rows


Unnamed: 0,álbum,uri 1 (album),NULL,FK
4903,tea cozy hat,the queen,ver no itunes,Single_itunes.apple.com~Row9753
1981,the company you keep,i saw a stranger with your hair,ver no itunes,Single_itunes.apple.com~Row11444
1806,highway to hell (the ultimate ac/dc tribute),highway to hell,ver no itunes,Single_itunes.apple.com~Row9408


Single_itunes.apple.com_fd_7.json: 57 rows


Unnamed: 0,album,uri 1 (album),NULL,FK
1,the essential elvis presley (remastered),,view in itunes,Single_itunes.apple.com~Row0
5,beyoncé,blue jeans remixes ep,bekijk in itunes,Single_itunes.apple.com~Row12
42,heart 4 sale,riviera life,bekijk in itunes,Single_itunes.apple.com~Row7


Single_itunes.apple.com_fd_8.json: 25 rows


Unnamed: 0,artiest,uri 1 (album),NULL,FK
22,alex swings oscar sings!,riviera life,bekijk in itunes,Single_itunes.apple.com~Row7
8,kinky afro,id43463416,ver en itunes,Single_itunes.apple.com~Row0
7,clap your hands,id43463416,ver en itunes,Single_itunes.apple.com~Row0


Single_itunes.apple.com_fd_9.json: 19 rows


Unnamed: 0,beschreibung,uri 1 (album),NULL,FK
8,the penultimate episode of our tale.,,17.0,Single_itunes.apple.com~Row0
1,harding and martinez have the reckoning she th...,,10.0,Single_itunes.apple.com~Row0
3,jess harding finds herself in a very sticky si...,,12.0,Single_itunes.apple.com~Row0


Single_itunes.apple.com_fd_10.json: 19 rows


Unnamed: 0,erschienen,uri 1 (album),NULL,FK
8,,,17.0,Single_itunes.apple.com~Row0
6,,,15.0,Single_itunes.apple.com~Row0
2,,,11.0,Single_itunes.apple.com~Row0


DEBUG:root:[itunes.apple.com] [part-1] Decomposing class Band for col 8 (('artist',))
DEBUG:root:[itunes.apple.com] [part-1] Not decomposing context columns [('uri 3',), ('table heading',), ('page title',), ('uri 0',), ('uri 1 (music video)',), ('uri 1 (mix)',), ('uri 1 (album)',), ('uri 1 (podcast)',)]
DEBUG:root:[itunes.apple.com] [part-1] Inferring FDs for [NULL|album|FK|name|uri 0]
FD candidates: 100%|██████████| 11/11 [00:00<00:00, 26.00it/s]
DEBUG:root:[itunes.apple.com] [part-1] Got FD key [FK|name] -> [price|NULL|NULL|time|album|naam|description]


Band_itunes.apple.com_fd_0.json: 30 rows


Unnamed: 0,price,FK,name
13,0.99,Band_itunes.apple.com~Row7,"cantus arcticus, op. 61 (concerto for birds an..."
15,0.99,Band_itunes.apple.com~Row9,tico-tico
19,129.0,Band_itunes.apple.com~Row0,


Band_itunes.apple.com_fd_1.json: 28 rows


Unnamed: 0,NULL,FK,name
2,view in itunes,Band_itunes.apple.com~Row17,"piano sonata no. 14 in c-sharp minor, op. 27: ..."
1,view in itunes,Band_itunes.apple.com~Row15,cathy's theme from wuthering heights
19,1,Band_itunes.apple.com~Row1,


Band_itunes.apple.com_fd_2.json: 31 rows


Unnamed: 0,NULL,FK,name
25,2.0,Band_itunes.apple.com~Row0,boostie sample from the 2nd electro attack max...
30,4.0,Band_itunes.apple.com~Row22,earthquake weather
20,5.0,Band_itunes.apple.com~Row0,


Band_itunes.apple.com_fd_3.json: 40 rows


Unnamed: 0,time,FK,name
5,559.0,Band_itunes.apple.com~Row5,"the planets, op. 32: vi. uranus, the magician"
31,432.0,Band_itunes.apple.com~Row0,
11,256.0,Band_itunes.apple.com~Row4,"turandot, act iii: nessun dorma!"


Band_itunes.apple.com_fd_4.json: 31 rows


Unnamed: 0,album,FK,name
10,"orchestral music (nordic): rautavaara, e. - pi...",Band_itunes.apple.com~Row6,"cantus arcticus, op. 61, ""concerto for birds a..."
16,the incredibles (music from the motion picture),Band_itunes.apple.com~Row10,the incredits
12,pirates of the caribbean: swashbuckling sea songs,Band_itunes.apple.com~Row12,"yo, ho (a pirate's life for me)"


Band_itunes.apple.com_fd_5.json: 15 rows


Unnamed: 0,naam,FK,name
4,nothing else matters,Band_itunes.apple.com~Row0,
1,"nights in white satin (the night) [including ""...",Band_itunes.apple.com~Row0,
2,the day that never comes,Band_itunes.apple.com~Row0,


Band_itunes.apple.com_fd_6.json: 2 rows


Unnamed: 0,description,FK,name
1,"buy the new maxi single at itunes, amazon, jun...",Band_itunes.apple.com~Row0,boostie sample from the 3rd electro attack max...
0,"buy the new maxi single at itunes, amazon, jun...",Band_itunes.apple.com~Row0,boostie sample from the 2nd electro attack max...


DEBUG:root:[itunes.apple.com] [part-2] Decomposing class Artist for col 10 (('uri 1 (itunes u)',))
DEBUG:root:[itunes.apple.com] [part-2] Not decomposing context columns [('page title',), ('table heading',), ('uri 0',), ('uri 3',)]
DEBUG:root:[itunes.apple.com] [part-2] Inferring FDs for [uri 0|name|description|released|price|NULL|FK]
FD candidates: 100%|██████████| 22/22 [00:00<00:00, 61.79it/s]
DEBUG:root:[itunes.apple.com] [part-2] Got FD key [FK|name] -> [NULL|description|released|price|NULL]


Artist_itunes.apple.com_fd_0.json: 14 rows


Unnamed: 0,NULL,FK,name
9,1.0,Artist_itunes.apple.com~Row1,videopoverty and economic justice scholarship ...
7,1.0,Artist_itunes.apple.com~Row0,yve-alain bois
12,2.0,Artist_itunes.apple.com~Row3,videodean's lecture series: the fate of vulner...


Artist_itunes.apple.com_fd_1.json: 14 rows


Unnamed: 0,description,FK,name
11,--,Artist_itunes.apple.com~Row2,prepare the way of the lord: the story of john...
7,"sep 19, 2013",Artist_itunes.apple.com~Row0,yve-alain bois
6,"jul 16, 2009",Artist_itunes.apple.com~Row0,eric fischl


Artist_itunes.apple.com_fd_2.json: 14 rows


Unnamed: 0,released,FK,name
6,7/16/2009,Artist_itunes.apple.com~Row0,eric fischl
9,2/14/2011,Artist_itunes.apple.com~Row1,videopoverty and economic justice scholarship ...
12,10/1/2010,Artist_itunes.apple.com~Row3,videodean's lecture series: the fate of vulner...


DEBUG:root:[itunes.apple.com] [part-3] Decomposing class Film for col 4 (('タイトル',))
DEBUG:root:[itunes.apple.com] [part-3] Not decomposing context columns [('page title',), ('table heading',), ('uri 0',), ('uri 1 (album)',), ('uri 3',), ('disambiguation of page title',)]
DEBUG:root:[itunes.apple.com] [part-3] Inferring FDs for [uri 0|NULL|FK|アルバム|uri 1 (album)]
FD candidates: 100%|██████████| 11/11 [00:00<00:00, 44.28it/s]
DEBUG:root:[itunes.apple.com] [part-3] Got FD key [FK] -> [NULL|アルバム|時間|価格|NULL|アーティスト]


Film_itunes.apple.com_fd_0.json: 18 rows


Unnamed: 0,NULL,FK
16,5,Film_itunes.apple.com~Row4
12,2,Film_itunes.apple.com~Row1
8,afficher sur itunes,Film_itunes.apple.com~Row8


Film_itunes.apple.com_fd_1.json: 18 rows


Unnamed: 0,アルバム,FK
2,city of echoes,Film_itunes.apple.com~Row12
5,city of echoes,Film_itunes.apple.com~Row15
13,"greatest hits, vol. 1",Film_itunes.apple.com~Row6


Film_itunes.apple.com_fd_2.json: 18 rows


Unnamed: 0,時間,FK
8,509.0,Film_itunes.apple.com~Row8
1,506.0,Film_itunes.apple.com~Row11
9,720.0,Film_itunes.apple.com~Row9


Film_itunes.apple.com_fd_3.json: 18 rows


Unnamed: 0,価格,FK
11,200.0,Film_itunes.apple.com~Row0
17,200.0,Film_itunes.apple.com~Row7
6,99.0,Film_itunes.apple.com~Row16


Film_itunes.apple.com_fd_4.json: 18 rows


Unnamed: 0,NULL,FK
14,,Film_itunes.apple.com~Row5
17,,Film_itunes.apple.com~Row7
4,7.0,Film_itunes.apple.com~Row14


Film_itunes.apple.com_fd_5.json: 8 rows


Unnamed: 0,アーティスト,FK
4,ゴードン・ライトフット,Film_itunes.apple.com~Row5
7,"peter, paul & mary",Film_itunes.apple.com~Row7
3,ジェイムス・テイラー,Film_itunes.apple.com~Row6


Single_itunes.apple.com.json: 12290 rows


Unnamed: 0,PK,rdf-schema#label
9200,Single_itunes.apple.com~Row9200,i luv halloween
8628,Single_itunes.apple.com~Row8628,ten thousand fists
4756,Single_itunes.apple.com~Row4756,because i love her


Band_itunes.apple.com.json: 23 rows


Unnamed: 0,PK,rdf-schema#label
20,Band_itunes.apple.com~Row20,arcade fire
22,Band_itunes.apple.com~Row22,beck
18,Band_itunes.apple.com~Row18,mitsuko uchida


Artist_itunes.apple.com.json: 4 rows


Unnamed: 0,PK,rdf-schema#label
2,Artist_itunes.apple.com~Row2,seminar on expository preaching
1,Artist_itunes.apple.com~Row1,poverty economic justice scholarship
0,Artist_itunes.apple.com~Row0,duncan phillips lectures


Film_itunes.apple.com.json: 18 rows


Unnamed: 0,PK,rdf-schema#label
0,Film_itunes.apple.com~Row0,sundown
2,Film_itunes.apple.com~Row2,the boxer
3,Film_itunes.apple.com~Row3,the fields of athenry


INFO:root:[itunes.apple.com] Created tables for classes: {'Single': 11, 'Band': 7, 'Artist': 3, 'Film': 6}
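
The decomposition relies on functional dependencies that may hold only approximately in noisy web data. As a minimal sketch, the confidence of a candidate dependency can be measured as the fraction of rows that agree with the most common right-hand value in their left-hand group, and then compared to a minimum such as the `nary_minp` value of 0.95 used above. This interpretation of `nary_minp`, the function `fd_confidence`, and the sample rows are all assumptions; the measure used inside `iter_decomposed` is not shown in this notebook.

```python
from collections import Counter, defaultdict


def fd_confidence(rows, lhs, rhs):
    """Fraction of rows agreeing with the most common rhs value of their lhs group."""
    if not rows:
        return 1.0
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[c] for c in lhs)
        groups[key][row[rhs]] += 1
    agreeing = sum(counts.most_common(1)[0][1] for counts in groups.values())
    return agreeing / len(rows)


# Illustrative rows: an album, a track number within the album, and a track name.
rows = [
    {"album": "a", "nr": 1, "name": "x"},
    {"album": "a", "nr": 2, "name": "y"},
    {"album": "a", "nr": 3, "name": "z"},
    {"album": "b", "nr": 1, "name": "w"},
]
conf_binary = fd_confidence(rows, ["album"], "name")
conf_nary = fd_confidence(rows, ["album", "nr"], "name")
print(conf_binary, conf_nary)
```

In this toy data the album alone does not determine the track name, while the album together with the track number does, which is the kind of composite key that n-ary induction is meant to find.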
