Movie file names usually encode as much information about the movie as possible. This repo contains code in several languages to parse these names and extract that information.
movie information extraction

Code to extract the movie name from a file name

Step 1: regex filter 1 removes all data in () and [], after first extracting information such as the season number and episode number from it. Regular expressions used are:
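As a rough illustration of this step, the Python sketch below pulls a season and episode number out of bracketed text and then drops every (...) and [...] group. The `S01E02` tag format, both patterns, and the function name `strip_brackets` are assumptions made for the example, not necessarily the expressions this repo uses.

```python
import re

# Assumed tag format: S01E02 (season 1, episode 2), case-insensitive.
SEASON_EPISODE = re.compile(r"S(\d{1,2})E(\d{1,2})", re.IGNORECASE)
# Matches one (...) or [...] group; nested brackets are not handled.
BRACKETED = re.compile(r"\([^()]*\)|\[[^\[\]]*\]")


def strip_brackets(filename):
    """Return (cleaned_name, season, episode); season/episode may be None."""
    season = episode = None
    # Look inside each bracketed group before throwing it away.
    for group in BRACKETED.findall(filename):
        match = SEASON_EPISODE.search(group)
        if match:
            season, episode = int(match.group(1)), int(match.group(2))
            break
    # Remove the groups, then trim leftover dots and spaces at the ends.
    cleaned = BRACKETED.sub("", filename).strip(" .")
    return cleaned, season, episode


print(strip_brackets("Some.Show.[S01E02].(x264)"))
```

Extracting before deleting matters here: once the brackets are removed, any season or episode tag inside them is gone.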


Step 2: A movie name may now look like:

Iron.Man.3.2013.3D.1080p

Replace each "." with a space to get a name like:

Iron Man 3 2013 3D 1080p
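A minimal Python sketch of this replacement follows. The helper name `dots_to_spaces` is hypothetical, and collapsing repeated whitespace is an extra assumption so that names with consecutive dots still come out clean.

```python
def dots_to_spaces(name):
    """Replace every '.' with a space and collapse repeated whitespace."""
    return " ".join(name.replace(".", " ").split())


print(dots_to_spaces("Iron.Man.3.2013.3D.1080p"))
```

Note that this simple approach also splits tokens that legitimately contain a dot, such as an audio tag like 5.1.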

The name is now of the form:

[name] [version] [year] [3D/2D] [resolution]

We now need to parse a name of this form to extract each field.

Parsing can be done with a separate regex filter for each field:

Regex 1: to identify the movie part number, such as the 3 in Iron Man 3

/\b\d /i
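One way to try this idea in Python is sketched below. It assumes the part number is a single digit standing alone as a word, which is a slightly stricter variant of the expression above; `PART` and `find_part` are names invented for the example.

```python
import re

# A single digit standing alone as a word, e.g. the "3" in "Iron Man 3".
# Digits inside longer tokens such as "2013" or "3D" do not match.
PART = re.compile(r"\b(\d)\b")


def find_part(name):
    """Return the first standalone digit as an int, or None."""
    match = PART.search(name)
    return int(match.group(1)) if match else None


print(find_part("Iron Man 3 2013 3D 1080p"))
```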

Regex 2: to determine the year, such as the 2013 in Iron Man 2013
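A possible year filter is sketched below in Python. The 1900 to 2099 range and the names `YEAR` and `find_year` are assumptions for illustration. A title that is itself a year-like number would be misread by this pattern.

```python
import re

# Four-digit year from 1900 to 2099, matched as a whole word (assumed range).
YEAR = re.compile(r"\b((?:19|20)\d{2})\b")


def find_year(name):
    """Return the first year-like token as an int, or None."""
    match = YEAR.search(name)
    return int(match.group(1)) if match else None


print(find_year("Iron Man 3 2013 3D 1080p"))
```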


Regex 3: to determine the dimension, such as Iron Man 3D or Iron Man 3d
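A hedged Python sketch of a dimension filter follows. It assumes only 2D and 3D tags occur and normalizes the result to upper case; `DIMENSION` and `find_dimension` are hypothetical names.

```python
import re

# "2D" or "3D" as a whole word, in either letter case.
DIMENSION = re.compile(r"\b([23])D\b", re.IGNORECASE)


def find_dimension(name):
    """Return "2D" or "3D" (upper case), or None if no tag is present."""
    match = DIMENSION.search(name)
    return match.group(1) + "D" if match else None


print(find_dimension("Iron Man 3 2013 3d 1080p"))
```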


Regex 4: to determine the resolution, such as the 1080p in Iron Man 1080p
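The resolution can be matched along the same lines. The Python sketch below assumes the tag is three or four digits followed by p or i (for example 720p or 1080i); `RESOLUTION` and `find_resolution` are illustrative names.

```python
import re

# Three or four digits followed by "p" or "i", as a whole word.
RESOLUTION = re.compile(r"\b(\d{3,4}[pi])\b", re.IGNORECASE)


def find_resolution(name):
    """Return the resolution tag in lower case, or None."""
    match = RESOLUTION.search(name)
    return match.group(1).lower() if match else None


print(find_resolution("Iron Man 3 2013 3D 1080p"))
```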


From this we can generate information such as:

  Filename: Iron.Man.3.2013.3D.1080p
  Title: Iron Man 3
  Part: 3
  Year: 2013
  Resolution: 1080p
  Dimension: 3D
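Putting the steps together, the following Python sketch produces the fields listed above from a dotted file name. All patterns and the function name `parse_movie_name` are assumptions for illustration. In particular, the title is assumed to be everything before the first year, dimension, or resolution token.

```python
import re

# Hypothetical patterns; the year range and resolution suffixes are assumptions.
PART = re.compile(r"\b(\d)\b")
YEAR = re.compile(r"\b((?:19|20)\d{2})\b")
DIMENSION = re.compile(r"\b([23]D)\b", re.IGNORECASE)
RESOLUTION = re.compile(r"\b(\d{3,4}[pi])\b", re.IGNORECASE)


def parse_movie_name(filename):
    """Split a dotted file name into title, part, year, resolution, dimension."""
    # Step 2: dots become spaces.
    name = " ".join(filename.replace(".", " ").split())
    found = {
        "Year": YEAR.search(name),
        "Resolution": RESOLUTION.search(name),
        "Dimension": DIMENSION.search(name),
    }
    # The title is assumed to end where the first metadata token begins.
    starts = [m.start() for m in found.values() if m]
    title = name[: min(starts)].strip() if starts else name
    # The part number is searched only within the title.
    part = PART.search(title)
    info = {"Filename": filename, "Title": title}
    info["Part"] = part.group(1) if part else None
    for key, match in found.items():
        info[key] = match.group(1) if match else None
    return info


for key, value in parse_movie_name("Iron.Man.3.2013.3D.1080p").items():
    print(f"{key}: {value}")
```

Searching for the part number only inside the title keeps a stray digit later in the name from being mistaken for it.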

