# Criterion analysis

In this notebook, we'll load a table of Criterion films pulled from [this page](https://www.criterion.com/shop/browse/list) into a data frame, clean up the director column a little, and answer a few basic questions.

I like this data set _not only_ because Criterion puts out good movies and I'm interested in finding more of them to enjoy, but also because it's a relatively small data set (1,310 records, as of early 2019) that's good for demonstrating a common data cleaning technique that mixes automation with human judgement. The goal here is to end up with a data set that accurately lists the director(s) for each film.

If you're following along at home and you don't already have the data, you'll want to run [this script](scrape.py) to scrape the data -- uncomment the next cell and run it:

In [2]:
# %run -i 'scrape.py' 

In [3]:
import pandas as pd

In [4]:
# load the scraped data into a dataframe
df = pd.read_csv('criterion.csv')

In [5]:
# how many records?
print(len(df))

1310


In [6]:
df.head()

Unnamed: 0,spine_no,cover_img,title,url,director,country,year
0,482.0,https://s3.amazonaws.com/criterion-production/...,2 or 3 Things I Know About Her,https://www.criterion.com/films/1333-2-or-3-th...,Jean-Luc Godard,France,1967.0
1,657.0,https://s3.amazonaws.com/criterion-production/...,3:10 to Yuma,https://www.criterion.com/films/27910-3-10-to-...,Delmer Daves,United States,1957.0
2,327.0,https://s3.amazonaws.com/criterion-production/...,3 Films by Louis Malle,https://www.criterion.com/boxsets/397-3-films-...,,,
3,672.0,https://s3.amazonaws.com/criterion-production/...,3 Films by Roberto Rossellini Starring Ingrid ...,https://www.criterion.com/boxsets/982-3-films-...,,,
4,528.0,https://s3.amazonaws.com/criterion-production/...,3 Silent Classics by Josef von Sternberg,https://www.criterion.com/boxsets/744-3-silent...,,,
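A side note on that `Unnamed: 0` column: pandas typically produces it when a CSV was saved with its index and then read back without saying so. Here's a minimal sketch of one way around it, assuming that's what happened here (I haven't confirmed it against `scrape.py`). The two-row `sample_csv` is made up for illustration:

```python
import io

import pandas as pd

# Hypothetical stand-in for criterion.csv: the empty first header cell is
# the saved index, which a plain read_csv() would name "Unnamed: 0"
sample_csv = io.StringIO(
    ",spine_no,title\n"
    "0,482.0,2 or 3 Things I Know About Her\n"
    "1,657.0,3:10 to Yuma\n"
)

# index_col=0 uses that first column as the index instead of keeping it as data
sample = pd.read_csv(sample_csv, index_col=0)
print(sample.columns.tolist())
```

The rest of this notebook leaves the column alone, since nothing below depends on it.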


Which director appears most often? A quick `value_counts()` will tell the tale:

In [7]:
df.director.value_counts()

Ingmar Bergman              40
Akira Kurosawa              27
Yasujiro Ozu                20
Louis Malle                 17
Rainer Werner Fassbinder    15
Roberto Rossellini          12
Jean-Luc Godard             12
Jean Renoir                 11
Federico Fellini            11
Agnès Varda                 10
Satyajit Ray                10
Kenji Misumi                10
Nagisa Oshima                9
Luis Buñuel                  9
Josef von Sternberg          9
Jean-Pierre Melville         9
Aki Kaurismäki               9
Kenji Mizoguchi              9
François Truffaut            9
Samuel Fuller                8
Masaki Kobayashi             8
Ernst Lubitsch               8
Wim Wenders                  8
Alfred Hitchcock             8
Keisuke Kinoshita            7
Wes Anderson                 7
Robert Bresson               7
Seijun Suzuki                7
Michelangelo Antonioni       7
Steven Soderbergh            7
                            ..
Donna Deitch                 1
Frank Ca

Check for records where the director value is null:

In [8]:
no_dir = df[df.director.isnull()]

In [9]:
print(len(no_dir))

114
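If all you want is the count, you can skip the intermediate frame. A quick sketch on a toy frame (the `toy` data is invented, not pulled from the real CSV):

```python
import pandas as pd

# Toy frame standing in for df; None marks a missing director
toy = pd.DataFrame({'director': ['Jean-Luc Godard', None, 'Delmer Daves', None]})

# isnull() returns a boolean Series, and summing it counts the True values
missing = toy.director.isnull().sum()
print(missing)
```

Keeping `no_dir` around is still handy here, because the next cells look at the rows themselves.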


In [10]:
no_dir.head()

Unnamed: 0,spine_no,cover_img,title,url,director,country,year
2,327.0,https://s3.amazonaws.com/criterion-production/...,3 Films by Louis Malle,https://www.criterion.com/boxsets/397-3-films-...,,,
3,672.0,https://s3.amazonaws.com/criterion-production/...,3 Films by Roberto Rossellini Starring Ingrid ...,https://www.criterion.com/boxsets/982-3-films-...,,,
4,528.0,https://s3.amazonaws.com/criterion-production/...,3 Silent Classics by Josef von Sternberg,https://www.criterion.com/boxsets/744-3-silent...,,,
6,418.0,https://s3.amazonaws.com/criterion-production/...,4 by Agnès Varda,https://www.criterion.com/boxsets/8-4-by-agn-s...,,,
14,900.0,https://s3.amazonaws.com/criterion-production/...,100 Years of Olympic Films: 1912–2012,https://www.criterion.com/films/29360-100-year...,,,0.0


Looks like a lot of the records with no director listed instead list the director in the title (e.g. `3 Films by Louis Malle`). What else?

In [11]:
for title in sorted(no_dir.title.unique()):
    print(title)

100 Years of Olympic Films: 1912–2012
3 Films by Louis Malle
3 Films by Roberto Rossellini Starring Ingrid Bergman
3 Silent Classics by Josef von Sternberg
4 by Agnès Varda
A Film Trilogy by Ingmar Bergman
A Story of Floating Weeds/Floating Weeds: Two Films by Yasujiro Ozu
A Whit Stillman Trilogy: Metropolitan, Barcelona, The Last Days of Disco
AK 100: 25 Films by Akira Kurosawa
America Lost and Found: The BBS Story
Andrzej Wajda: Three War Films
André Gregory & Wallace Shawn: 3 Films
By Brakhage: An Anthology, Volumes One and Two
Carl Theodor Dreyer Box Set
Classic Hitchcock
David Lean Directs Noël Coward
Dietrich & von Sternberg in Hollywood
Eclipse Series 10: Silent Ozu—Three Family Comedies
Eclipse Series 11: Larisa Shepitko
Eclipse Series 12: Aki Kaurismäki’s Proletariat Trilogy
Eclipse Series 13: Kenji Mizoguchi’s Fallen Women
Eclipse Series 14: Rossellini’s History Films—Renaissance and Enlightenment
Eclipse Series 15: Travels with Hiroshi Shimizu
Eclipse Series 16: Alexander Ko

### Cleaning up the director data

114 out of 1,310 records isn't bad, but the idea here is to be as completist as possible.

At first, I started to write a function to catch/replace patterns in a series of increasingly complicated regular expressions, but that suuuuuucked.

Easier, I think, to make a big dictionary with titles as keys and director info, if available, as the value -- I created the dictionary in a text editor with regex + find+replace and dropped in the correct director information manually (sometimes after a little Internet sleuthing).

Films with multiple directors, typically box sets or compilations, are listed as `Various`.
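For the curious, here's roughly what the "automation" half of that could look like in Python instead of a text editor. This is a sketch, not the exact process I used: a single regex guesses the director from titles ending in "by <name>" and leaves everything else as `None` for a human to fill in. The `draft_director_dict` helper and the four sample titles are made up for illustration.

```python
import re

# A few titles from the list above, chosen to show hits and misses
titles = [
    '3 Films by Louis Malle',
    '4 by Agnès Varda',
    'Classic Hitchcock',
    'The Apu Trilogy',
]

# Matches "... by <name>" at the end of a title (case-sensitive on purpose)
by_pattern = re.compile(r'\bby (?P<name>.+)$')


def draft_director_dict(titles):
    """Return {title: guessed director or None} as a draft for manual review."""
    draft = {}
    for title in titles:
        match = by_pattern.search(title)
        draft[title] = match.group('name') if match else None
    return draft


draft = draft_director_dict(titles)
for title, guess in draft.items():
    print(repr(title), '->', repr(guess))
```

The guesses still need a human pass: a title like `3 Films by Roberto Rossellini Starring Ingrid Bergman` would come back with the "Starring ..." tail attached, which is exactly the kind of thing that made the all-regex approach painful.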

In [12]:
director_dict = {
    '3 Films by Louis Malle': 'Louis Malle',
    '3 Films by Roberto Rossellini Starring Ingrid Bergman': 'Roberto Rossellini',
    '3 Silent Classics by Josef von Sternberg': 'Josef von Sternberg',
    '4 by Agnès Varda': 'Agnès Varda',
    'A Film Trilogy by Ingmar Bergman': 'Ingmar Bergman',
    'A Story of Floating Weeds/Floating Weeds: Two Films by Yasujiro Ozu': 'Yasujiro Ozu',
    'A Whit Stillman Trilogy: Metropolitan, Barcelona, The Last Days of Disco': 'Whit Stillman',
    'AK 100: 25 Films by Akira Kurosawa': 'Akira Kurosawa',
    'Andrzej Wajda: Three War Films': 'Andrzej Wajda',
    'André Gregory & Wallace Shawn: 3 Films': 'André Gregory & Wallace Shawn',
    'By Brakhage: An Anthology, Volumes One and Two': 'Stan Brakhage',
    'Carl Theodor Dreyer Box Set': 'Carl Theodor Dreyer',
    'Classic Hitchcock': 'Alfred Hitchcock',
    'David Lean Directs Noël Coward': 'David Lean',
    'Dietrich & von Sternberg in Hollywood': 'Josef von Sternberg',
    'Eclipse Series 10: Silent Ozu—Three Family Comedies': 'Yasujiro Ozu',
    'Eclipse Series 11: Larisa Shepitko': 'Larisa Shepitko',
    'Eclipse Series 12: Aki Kaurismäki’s Proletariat Trilogy': 'Aki Kaurismäki',
    'Eclipse Series 13: Kenji Mizoguchi’s Fallen Women': 'Kenji Mizoguchi',
    'Eclipse Series 14: Rossellini’s History Films—Renaissance and Enlightenment': 'Roberto Rossellini',
    'Eclipse Series 15: Travels with Hiroshi Shimizu': 'Hiroshi Shimizu',
    'Eclipse Series 16: Alexander Korda’s Private Lives': 'Alexander Korda',
    'Eclipse Series 18: Dušan Makavejev—Free Radical': 'Dušan Makavejev',
    'Eclipse Series 19: Chantal Akerman in the Seventies': 'Chantal Akerman',
    'Eclipse Series 1: Early Bergman': 'Ingmar Bergman',
    'Eclipse Series 20: George Bernard Shaw on Film': 'Gabriel Pascal',
    'Eclipse Series 21: Oshima’s Outlaw Sixties': 'Nagisa Oshima',
    'Eclipse Series 22: Presenting Sacha Guitry': 'Sacha Guitry',
    'Eclipse Series 23: The First Films of Akira Kurosawa': 'Akira Kurosawa',
    'Eclipse Series 24: The Actuality Dramas of Allan King': 'Allan King',
    'Eclipse Series 25: Basil Dearden’s London Underground': 'Basil Dearden',
    'Eclipse Series 26: Silent Naruse': 'Mikio Naruse',
    'Eclipse Series 27: Raffaello Matarazzo’s Runaway Melodramas': 'Raffaello Matarazzo',
    'Eclipse Series 28: The Warped World of Koreyoshi Kurahara': 'Koreyoshi Kurahara',
    'Eclipse Series 29: Aki Kaurismäki’s Leningrad Cowboys': 'Aki Kaurismäki’',
    'Eclipse Series 2: The Documentaries of Louis Malle': 'Louis Malle',
    'Eclipse Series 30: Sabu!': 'Zoltán Korda',
    'Eclipse Series 31: Three Popular Films by Jean-Pierre Gorin': 'Jean-Pierre Gorin',
    'Eclipse Series 33: Up All Night with Robert Downey Sr.': 'Robert Downey Sr.',
    'Eclipse Series 34: Jean Grémillon During the Occupation': 'Jean Grémillon',
    'Eclipse Series 35: Maidstone and Other Films by Norman Mailer': 'Norman Mailer',
    'Eclipse Series 38: Masaki Kobayashi Against the System': 'Masaki Kobayashi',
    'Eclipse Series 39: Early Fassbinder': 'Rainer Werner Fassbinder',
    'Eclipse Series 3: Late Ozu': 'Yasujiro Ozu',
    'Eclipse Series 40: Late Ray': 'Satyajit Ray',
    'Eclipse Series 41: Kinoshita and World War II': 'Keisuke Kinoshita',
    'Eclipse Series 42: Silent Ozu—Three Crime Dramas': 'Yasujiro Ozu',
    'Eclipse Series 43: Agnès Varda in California': 'Agnès Varda',
    'Eclipse Series 44: Julien Duvivier in the Thirties': 'Julien Duvivier',
    'Eclipse Series 45: Claude Autant-Lara—Four Romantic Escapes from Occupied France': 'Claude Autant-Lara',
    'Eclipse Series 4: Raymond Bernard': 'Raymond Bernard',
    'Eclipse Series 5: The First Films of Samuel Fuller': 'Samuel Fuller',
    'Eclipse Series 6: Carlos Saura’s Flamenco Trilogy': 'Carlos Saura',
    'Eclipse Series 7: Postwar Kurosawa': 'Akira Kurosawa',
    'Eclipse Series 8: Lubitsch Musicals': 'Ernst Lubitsch',
    'Eclipse Series 9: The Delirious Fictions of William Klein': 'William Klein',
    'Eisenstein: The Sound Years': 'Sergei Eisenstein',
    'Fanny and Alexander Box Set': 'Ingmar Bergman',
    'Gates of Heaven/Vernon, Florida': 'Errol Morris',
    'Grey Gardens The Beales of Grey Gardens Box Set': 'Albert Maysles and David Maysles',
    'I Am Curious . . . Box set': 'Vilgot Sjöman',
    'Ingmar Bergman’s Cinema': 'Ingmar Bergman',
    'John Cassavetes: Five Films': 'John Cassavetes',
    'La Jetée/Sans Soleil': 'Chris Marker',
    'Letters from Fontainhas: Three Films by Pedro Costa': 'Pedro Costa',
    'Olivier’s Shakespeare': 'Laurence Olivier',
    'Pierre Etaix': 'Pierre Etaix',
    'Pigs, Pimps & Prostitutes: 3 Films by Shohei Imamura': 'Shohei Imamura',
    'Police Story/Police Story 2': 'Jackie Chan',
    'Roberto Rossellini’s War Trilogy': 'Roberto Rossellini',
    'Six Moral Tales': 'Eric Rohmer',
    'Stage and Spectacle: Three Films by Jean Renoir': 'Jean Renoir',
    'The Adventures of Antoine Doinel': 'François Truffaut',
    'The Apu Trilogy': 'Satyajit Ray',
    'The BRD Trilogy': 'Rainer Werner Fassbinder',
    'The Before Trilogy': 'Richard Linklater',
    'The Complete Jacques Tati': 'Jacques Tati',
    'The Complete Jean Vigo': 'Jean Vigo',
    'The Complete Lady Snowblood': 'Toshiya Fujita',
    'The Complete Monterey Pop Festival': 'D.A. Pennebaker and Chris Hegedus',
    'The Emigrants/The New Land': 'Jan Troell',
    'The Essential Jacques Demy': 'Jacques Demy',
    'The Lower Depths': 'Jean Renoir and Akira Kurosawa',
    'The Marseille Trilogy': 'Marcel Pagnol',
    'The Only Son/There Was a Father: Two Films by Yasujiro Ozu': 'Yasujiro Ozu',
    'The Orphic Trilogy': 'Jean Cocteau',
    'The Qatsi Trilogy': 'Godfrey Reggio',
    'The Samurai Trilogy': 'Hiroshi Inagaki',
    'The Shooting/Ride in the Whirlwind': 'Monte Hellman',
    'Three Colors': 'Krzysztof Kieślowski',
    'Three Films by Hiroshi Teshigahara': 'Hiroshi Teshigahara',
    'Trilogy of Life': 'Pier Paolo Pasolini',
    'Trilogía de Guillermo del Toro': 'Guillermo del Toro',
    'Wim Wenders: The Road Trilogy': 'Wim Wenders',
    'Yojimbo/Sanjuro Box Set': 'Akira Kurosawa',
    '100 Years of Olympic Films: 1912–2012': 'Various',
    'America Lost and Found: The BBS Story': 'Various',
    'Eclipse Series 17: Nikkatsu Noir': 'Various',
    'Eclipse Series 32: Pearls of the Czech New Wave': 'Various',
    'Eclipse Series 36: Three Wicked Melodramas from Gainsborough Pictures': 'Various',
    'Eclipse Series 37: When Horror Came to Shochiku': 'Various',
    'Eclipse Series 46: Ingrid Bergman’s Swedish Years': 'Various',
    'Essential Art House: 50 Years of Janus Films': 'Various',
    'The Golden Age of Television': 'Various',
    'Great Adaptations': 'Various',
    'The Killers': 'Various',
    'Lone Wolf and Cub': 'Various',
    'Martin Scorsese’s World Cinema Project': 'Various',
    'Martin Scorsese’s World Cinema Project No. 2': 'Various',
    'Monsters and Madmen': 'Various',
    'Paul Robeson: Portraits of the Artist': 'Various',
    'Rebel Samurai: Sixties Swordplay Classics': 'Various',
    'The Rock Box': 'Various',
    'Zatoichi: The Blind Swordsman': 'Various'
}

Next, because you never want to overwrite original data, copy over the director info into a new column:

In [13]:
df['director_fix'] = df['director']

In [14]:
df.head()

Unnamed: 0,spine_no,cover_img,title,url,director,country,year,director_fix
0,482.0,https://s3.amazonaws.com/criterion-production/...,2 or 3 Things I Know About Her,https://www.criterion.com/films/1333-2-or-3-th...,Jean-Luc Godard,France,1967.0,Jean-Luc Godard
1,657.0,https://s3.amazonaws.com/criterion-production/...,3:10 to Yuma,https://www.criterion.com/films/27910-3-10-to-...,Delmer Daves,United States,1957.0,Delmer Daves
2,327.0,https://s3.amazonaws.com/criterion-production/...,3 Films by Louis Malle,https://www.criterion.com/boxsets/397-3-films-...,,,,
3,672.0,https://s3.amazonaws.com/criterion-production/...,3 Films by Roberto Rossellini Starring Ingrid ...,https://www.criterion.com/boxsets/982-3-films-...,,,,
4,528.0,https://s3.amazonaws.com/criterion-production/...,3 Silent Classics by Josef von Sternberg,https://www.criterion.com/boxsets/744-3-silent...,,,,


... and apply the fix:

In [15]:
df['director_fix'] = df.apply(lambda row: director_dict.get(row.title) or row.director, axis=1)

In [16]:
df.head()

Unnamed: 0,spine_no,cover_img,title,url,director,country,year,director_fix
0,482.0,https://s3.amazonaws.com/criterion-production/...,2 or 3 Things I Know About Her,https://www.criterion.com/films/1333-2-or-3-th...,Jean-Luc Godard,France,1967.0,Jean-Luc Godard
1,657.0,https://s3.amazonaws.com/criterion-production/...,3:10 to Yuma,https://www.criterion.com/films/27910-3-10-to-...,Delmer Daves,United States,1957.0,Delmer Daves
2,327.0,https://s3.amazonaws.com/criterion-production/...,3 Films by Louis Malle,https://www.criterion.com/boxsets/397-3-films-...,,,,Louis Malle
3,672.0,https://s3.amazonaws.com/criterion-production/...,3 Films by Roberto Rossellini Starring Ingrid ...,https://www.criterion.com/boxsets/982-3-films-...,,,,Roberto Rossellini
4,528.0,https://s3.amazonaws.com/criterion-production/...,3 Silent Classics by Josef von Sternberg,https://www.criterion.com/boxsets/744-3-silent...,,,,Josef von Sternberg
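The row-wise `apply` is plenty fast at this size. On a bigger table, a column-wise version should give the same result: `map` looks up each title in the dictionary, and `fillna` falls back to the original director where there was no match. A sketch on a toy frame (`toy` and `toy_dict` are stand-ins, not the real data):

```python
import pandas as pd

toy = pd.DataFrame({
    'title': ['3 Films by Louis Malle', '3:10 to Yuma'],
    'director': [None, 'Delmer Daves'],
})
toy_dict = {'3 Films by Louis Malle': 'Louis Malle'}

# Titles missing from the dict map to NaN, which fillna() replaces with
# the value from the original director column at the same index
toy['director_fix'] = toy['title'].map(toy_dict).fillna(toy['director'])
print(toy)
```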


Did we miss any?

In [17]:
print(len(df[df.director_fix.isnull()]))

0
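A zero here means every null got filled, but a hand-built dictionary can still fail quietly: a key with a stray character matches no title and simply never gets used. One extra sanity check is to look for keys that don't appear in the title column. A sketch with invented data (`toy` and `toy_dict` are hypothetical, including the deliberately misspelled key):

```python
import pandas as pd

toy = pd.DataFrame({'title': ['The Apu Trilogy', 'Three Colors']})
toy_dict = {
    'The Apu Trilogy': 'Satyajit Ray',
    'Three Colours': 'Krzysztof Kieślowski',  # misspelled on purpose
}

# Keys that never matched a title, so their fix was silently skipped
unmatched = sorted(set(toy_dict) - set(toy['title']))
print(unmatched)
```

An empty list means every key in the dictionary did some work.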


Whoop whoop! Now check for misspellings, typos, etc., and fix using a similar process:

In [18]:
for director in sorted(df[df.director_fix.notnull()].director_fix.unique()):
    print(director)

Abbas Kiarostami
Abdellatif Kechiche
Agnieszka Smoczyńska
Agnès Varda
Ahmed El Maânouni
Aki Kaurismäki
Aki Kaurismäki’
Akira Inoue
Akira Kurosawa
Al Reinert
Alain Resnais
Albert Brooks
Albert Lamorisse
Albert Maysles and David Maysles
Albert Maysles…
Alberto Lattuada
Alex Cox
Alexander Hall
Alexander Korda
Alexander Mackendrick
Alexander Payne
Alf Sjöberg
Alfonso Cuarón
Alfred Hitchcock
Allan King
Allen Baron
Allison Anders…
Anatole Litvak
Andrea Arnold
Andrei Tarkovsky
Andrew Haigh
Andrzej Wajda
André Gregory & Wallace Shawn
Ang Lee
Anthony Asquith
Anthony Asquith…
Anthony Mann
Antonio Pietrangeli
Apichatpong Weerasethakul
Arnaud Desplechin
Arthur Crabtree
Arthur Hiller
Barbara Kopple
Barbara Loden
Barbet Schroeder
Basil Dearden
Benjamin Christensen
Bernardo Bertolucci
Bernhard Wicki
Bertrand Tavernier
Billy Wilder
Bob Fosse
Bob Rafelson
Brian De Palma
Bruce Beresford
Bruce Robinson
Bruno Dumont
Buichi Saito
Byron Haskin
Carl Th. Dreyer
Carl Theodor Dreyer
Carlos Reygadas
Carlos Saura
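Eyeballing a sorted list works fine at this size. If the list were longer, the standard library's `difflib` could flag candidate pairs for a human to review. This is a sketch of that idea rather than what I did here; the `near_duplicates` helper and the `0.8` cutoff are my own choices for illustration, and the right cutoff will vary by data set.

```python
from difflib import get_close_matches

# A handful of names from the list above
names = [
    'Aki Kaurismäki',
    'Aki Kaurismäki’',
    'Akira Kurosawa',
    'Carl Th. Dreyer',
    'Carl Theodor Dreyer',
    'Carlos Saura',
]


def near_duplicates(names, cutoff=0.8):
    """Return (name, lookalike) pairs whose similarity ratio meets the cutoff."""
    pairs = []
    for i, name in enumerate(names):
        # Only compare against later names so each pair is reported once
        for match in get_close_matches(name, names[i + 1:], n=3, cutoff=cutoff):
            pairs.append((name, match))
    return pairs


pairs = near_duplicates(names)
for pair in pairs:
    print(pair)
```

Anything it flags is only a suggestion. Two different directors can have similar names, so the final call on what goes into the typo dictionary stays manual.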

In [19]:
director_typos = {
    'Aki Kaurismäki’': 'Aki Kaurismäki',
    'Carl Th. Dreyer': 'Carl Theodor Dreyer',
    'Charles Chaplin': 'Charlie Chaplin',
    'D. A. Pennebaker': 'D.A. Pennebaker'
}

In [20]:
def fix_typos(row):
    clean_director = director_typos.get(row.director_fix) or row.director_fix
    return clean_director.replace('…', '').strip()

In [21]:
df.director_fix = df.apply(fix_typos, axis=1)
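The same cleanup can be written without a row-wise function. In this sketch, `Series.replace` does the exact-match swaps from the dictionary and the `.str` methods strip the trailing ellipsis character and whitespace. The three names and the one-entry `typos` dict are a toy subset for illustration:

```python
import pandas as pd

names = pd.Series(['Carl Th. Dreyer', 'Albert Maysles…', 'Agnès Varda'])
typos = {'Carl Th. Dreyer': 'Carl Theodor Dreyer'}

cleaned = (
    names.replace(typos)                      # exact-match swaps from the dict
         .str.replace('…', '', regex=False)   # drop the literal ellipsis character
         .str.strip()                         # trim leftover whitespace
)
print(cleaned.tolist())
```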

Let's re-check the `value_counts()`:

In [22]:
df.director_fix.value_counts()

Ingmar Bergman                       44
Akira Kurosawa                       30
Yasujiro Ozu                         25
Various                              22
Louis Malle                          19
Rainer Werner Fassbinder             17
Roberto Rossellini                   15
Jean-Luc Godard                      13
Agnès Varda                          12
Satyajit Ray                         12
Federico Fellini                     12
Aki Kaurismäki                       11
Josef von Sternberg                  11
Jean Renoir                          11
Kenji Misumi                         10
Nagisa Oshima                        10
Kenji Mizoguchi                      10
François Truffaut                    10
Masaki Kobayashi                      9
Ernst Lubitsch                        9
Jean-Pierre Melville                  9
Alfred Hitchcock                      9
David Lean                            9
Luis Buñuel                           9
Wim Wenders                           9


The top of the list hasn't changed much, except that the counts are now more accurate and the multi-director sets show up as `Various`.

Next up: What's up in Japan? (I love Kurosawa, and Kobayashi is a friggin _god_. Who else should I get to know?)

First up, loop over non-null countries and print:

In [23]:
for country in sorted(df[df.country.notnull()].country.unique()):
    print(country)

Argentina
Australia
Austria
Bangladesh
Belgium
Brazil
Canada
China
Cuba
Czechoslovakia
Denmark
Finland
France
Germany
Guatemala
Hong Kong
India
Iran
Ireland
Italy
Japan
Kazakhstan
Macedonia
Mexico
Morocco
Netherlands
New Zealand
Norway
Philippines
Poland
Portugal
Romania
Senegal
South Korea
Soviet Union
Spain
Sweden
Taiwan
Thailand
Turkey
United Kingdom
United States
West Germany
Yugoslavia


(An exercise for another time: Finding and fixing films that don't have a country of origin.)

I'd like to look at a list of Japanese films like this:
`year,director,title`

... sorted first by director, then by year, so I can check out a director's Criterion output chronologically:

In [25]:
japan = df[df.country == 'Japan']

In [26]:
len(japan)

179

In [37]:
# iterrows() yields (index, row) pairs; the index isn't needed here
for _, row in japan[['title', 'director_fix', 'year']].sort_values(['director_fix', 'year']).iterrows():
    print('{}  {:<32}  {}'.format(int(row.year), row.director_fix, row.title))

2012  Abbas Kiarostami                  Like Someone in Love
1965  Akira Inoue                       Zatoichi’s Revenge
1943  Akira Kurosawa                    Sanshiro Sugata
1944  Akira Kurosawa                    The Most Beautiful
1945  Akira Kurosawa                    The Men Who Tread on the Tiger’s Tail
1945  Akira Kurosawa                    Sanshiro Sugata, Part Two
1946  Akira Kurosawa                    No Regrets for Our Youth
1947  Akira Kurosawa                    One Wonderful Sunday
1948  Akira Kurosawa                    Drunken Angel
1949  Akira Kurosawa                    Stray Dog
1950  Akira Kurosawa                    Rashomon
1950  Akira Kurosawa                    Scandal
1951  Akira Kurosawa                    The Idiot
1952  Akira Kurosawa                    Ikiru
1954  Akira Kurosawa                    Seven Samurai
1955  Akira Kurosawa                    I Live in Fear
1957  Akira Kurosawa                    Throne of Blood
1958  Akira Kurosawa             

Beauty. What about movies with "samurai" in the title?

In [43]:
japan[japan.title.str.contains('samurai', case=False)].sort_values('year')

Unnamed: 0,spine_no,cover_img,title,url,director,country,year,director_fix
998,14.0,https://s3.amazonaws.com/criterion-production/...,Samurai I: Musashi Miyamoto,https://www.criterion.com/films/529-samurai-i-...,Hiroshi Inagaki,Japan,1954.0,Hiroshi Inagaki
1024,2.0,https://s3.amazonaws.com/criterion-production/...,Seven Samurai,https://www.criterion.com/films/165-seven-samurai,Akira Kurosawa,Japan,1954.0,Akira Kurosawa
996,15.0,https://s3.amazonaws.com/criterion-production/...,Samurai II: Duel at Ichijoji Temple,https://www.criterion.com/films/530-samurai-ii...,Hiroshi Inagaki,Japan,1955.0,Hiroshi Inagaki
997,16.0,https://s3.amazonaws.com/criterion-production/...,Samurai III: Duel at Ganryu Island,https://www.criterion.com/films/531-samurai-ii...,Hiroshi Inagaki,Japan,1956.0,Hiroshi Inagaki
1149,596.0,https://s3.amazonaws.com/criterion-production/...,Three Outlaw Samurai,https://www.criterion.com/films/27734-three-ou...,Hideo Gosha,Japan,1964.0,Hideo Gosha
1000,312.0,https://s3.amazonaws.com/criterion-production/...,Samurai Spy,https://www.criterion.com/films/762-samurai-spy,Masahiro Shinoda,Japan,1965.0,Masahiro Shinoda
999,310.0,https://s3.amazonaws.com/criterion-production/...,Samurai Rebellion,https://www.criterion.com/films/753-samurai-re...,Masaki Kobayashi,Japan,1967.0,Masaki Kobayashi
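One thing to watch with `str.contains`: if the column has missing values, the result contains `NaN` and can't be used directly as a filter. Passing `na=False` treats missing titles as non-matches. A small sketch with an invented Series (the `None` entry is there to show the behavior, not because the real data has one):

```python
import pandas as pd

titles = pd.Series(['Seven Samurai', None, 'Samurai Spy', 'Ikiru'])

# case=False ignores capitalization; na=False counts missing titles as non-matches
mask = titles.str.contains('samurai', case=False, na=False)
print(titles[mask].tolist())
```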


I had never heard of [Zatoichi](https://en.wikipedia.org/wiki/Zatoichi) until poking at this data. What's there?

In [44]:
japan[japan.title.str.contains('zatoichi', case=False)].sort_values(['year', 'director_fix'])

Unnamed: 0,spine_no,cover_img,title,url,director,country,year,director_fix
1112,,https://s3.amazonaws.com/criterion-production/...,The Tale of Zatoichi Continues,https://www.criterion.com/films/28301-the-tale...,Kazuo Mori,Japan,1962.0,Kazuo Mori
1111,,https://s3.amazonaws.com/criterion-production/...,The Tale of Zatoichi,https://www.criterion.com/films/28300-the-tale...,Kenji Misumi,Japan,1962.0,Kenji Misumi
1296,,https://s3.amazonaws.com/criterion-production/...,Zatoichi on the Road,https://www.criterion.com/films/28304-zatoichi...,Kimiyoshi Yasuda,Japan,1963.0,Kimiyoshi Yasuda
822,,https://s3.amazonaws.com/criterion-production/...,New Tale of Zatoichi,https://www.criterion.com/films/28302-new-tale...,Tokuzo Tanaka,Japan,1963.0,Tokuzo Tanaka
1304,,https://s3.amazonaws.com/criterion-production/...,Zatoichi the Fugitive,https://www.criterion.com/films/28303-zatoichi...,Tokuzo Tanaka,Japan,1963.0,Tokuzo Tanaka
1287,,https://s3.amazonaws.com/criterion-production/...,Zatoichi and the Chest of Gold,https://www.criterion.com/films/28305-zatoichi...,Kazuo Ikehiro,Japan,1964.0,Kazuo Ikehiro
1299,,https://s3.amazonaws.com/criterion-production/...,Zatoichi’s Flashing Sword,https://www.criterion.com/films/28306-zatoichi...,Kazuo Ikehiro,Japan,1964.0,Kazuo Ikehiro
413,,https://s3.amazonaws.com/criterion-production/...,"Fight, Zatoichi, Fight",https://www.criterion.com/films/28307-fight-za...,Kenji Misumi,Japan,1964.0,Kenji Misumi
19,,https://s3.amazonaws.com/criterion-production/...,Adventures of Zatoichi,https://www.criterion.com/films/28308-adventur...,Kimiyoshi Yasuda,Japan,1964.0,Kimiyoshi Yasuda
1301,,https://s3.amazonaws.com/criterion-production/...,Zatoichi’s Revenge,https://www.criterion.com/films/28309-zatoichi...,Akira Inoue,Japan,1965.0,Akira Inoue


```
|￣￣￣￣￣￣￣￣￣|
|    w h a t    |
|    t i m e    |
|      i s      |
|     i t ?     |
|               |
|               |
| l i b r a r y |
|    t i m e    |
|＿＿＿＿＿＿＿＿＿| 
(\__/) || 
(•ㅅ•) || 
/ 　 づ
```