# Introduction

A movie poster is one of the most important communication tools between general audiences and movie creators. It sets the first impression in only seconds. 

<img src="Star_War_Poster.jpg" width="200">

One significant element of movie poster design is brightness. Different brightness levels can evoke different emotional reactions in audiences, so some movie genres have stereotypical brightness patterns. For example, posters for horror movies are often dark and use black as the dominant color (see the new horror movie “IT”). With this type of design, audiences can feel the depressing and intense atmosphere just by looking at the poster. Comedies, on the other hand, tend to have much lighter poster designs (see the classic comedy “Home Alone”); a design with bright colors puts people at ease.

Horror|Comedy  
- | - 
<img src="IT_Poster.jpg" width="235"> | <img src="Home_Alone_Poster.jpg" width="200">

Sometimes an unconventional design can make audiences even more interested. The all-time classic thriller “The Silence of the Lambs” did not use the conventional all-dark poster and instead used a large proportion of white with a small dark corner.

<img src="Silence_Lamb_Poster.jpg" width="200">

One cannot ascribe the success of a movie solely to the design of its poster, but in a multi-billion-dollar industry where a small improvement could lead to a great return for investors, there is great merit in investigating the relationship between the brightness of a poster and the success of the movie. Specifically, when a poster’s brightness deviates from that of other movies in the same genre, is it a positive or a negative sign for the movie?
This study therefore strives to answer three research questions: <br>
<br>
1)	Which movie genres have a preferred brightness design?<br>
<br>
2)	Does deviating from the typical poster brightness design have a positive or negative effect on movie success? Is the result the same for every movie genre?<br>
<br>
3)	Do the results for questions 1) and 2) remain consistent over the years?


# Method
## Data
**9,953 U.S.-released movies from 1971 to 2016**<br>
**All data were scraped from IMDB (root URL: https://www.imdb.com/list/ls057823854/):**<br>
Movie title<br>
Release year<br>
Movie genre<br>
MPAA film rating (e.g., PG, PG-13, R)<br>
Movie budget<br>
Box-office gross<br>
IMDB rating<br>
Meta-score (i.e., critic rating)<br>
Poster brightness index (i.e., each poster was converted from RGB to HSV, and only the "V" (value) channel was used. The mean of all pixels’ V values was defined as the brightness of that movie poster. A higher number implies a brighter design.)<br>
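
The brightness index described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study: the function name `poster_brightness`, the flat list-of-pixels input, and the 0 to 1 scale for V are all hypothetical choices, and the tiny "posters" are made-up data.

```python
import colorsys

def poster_brightness(pixels):
    """Mean HSV 'V' over an iterable of (R, G, B) pixels with channels in 0..255.

    Assumption: brightness is reported on a 0..1 scale; the source does not
    state which scale was actually used.
    """
    values = []
    for r, g, b in pixels:
        # colorsys expects floats in [0, 1]; V is the third returned component.
        _, _, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        values.append(v)
    if not values:
        raise ValueError("poster has no pixels")
    return sum(values) / len(values)

# Hypothetical 2x2 "posters", flattened to pixel lists (made-up data).
dark_poster = [(0, 0, 0), (20, 10, 10), (40, 0, 0), (0, 0, 30)]
bright_poster = [(255, 255, 255), (250, 220, 90), (200, 230, 255), (255, 180, 180)]

print(poster_brightness(dark_poster))
print(poster_brightness(bright_poster))
```

Because V is simply the largest of the three RGB channels, a saturated color such as pure red counts as fully bright under this index, which is worth keeping in mind when interpreting the results.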
## Analysis
**Dependent variables**<br>
Box-office gross<br>
Movie profit: $BoxOffice - Budget$<br>
IMDB rating<br>
Meta-score<br>
**Independent variable**<br>
Deviance from the expected brightness (Deviance)<br>

\begin{align}
Deviance & = Brt_{iG_{1...J}}-E(Brt_{iG_{1...J}}) \\
 & = Brt_{iG_{1...J}}-\frac{\sum_{j=1}^J\overline{Brt_j}}{J}
\end{align}

where $Brt_{iG_{1...J}}$ means the actual brightness of movie $i$ with genre tags of genre $1$ to $J$, and $\overline{Brt_j}$ means the average brightness of movie genre $j$.<br>
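
As a worked illustration of the deviance formula, the following Python sketch computes each genre's average brightness and then subtracts the mean of those averages over a movie's own genre tags. The function names, the dictionary layout, and the four movies are hypothetical, and the sketch assumes that a movie's own poster is included when averaging its genres, which the definition above does not specify.

```python
from statistics import mean

def genre_means(movies):
    """Average brightness of every genre, pooling all movies tagged with it."""
    by_genre = {}
    for movie in movies:
        for genre in movie["genres"]:
            by_genre.setdefault(genre, []).append(movie["brightness"])
    return {genre: mean(vals) for genre, vals in by_genre.items()}

def deviance(movie, means):
    """Actual brightness minus the mean of the genre averages over the movie's J tags."""
    expected = mean(means[g] for g in movie["genres"])
    return movie["brightness"] - expected

# Made-up data for illustration only.
movies = [
    {"title": "A", "genres": ["Horror"], "brightness": 0.20},
    {"title": "B", "genres": ["Horror", "Thriller"], "brightness": 0.30},
    {"title": "C", "genres": ["Comedy"], "brightness": 0.70},
    {"title": "D", "genres": ["Thriller"], "brightness": 0.60},
]

means = genre_means(movies)
for m in movies:
    print(m["title"], round(deviance(m, means), 3))
```

In this toy data, movie B carries two tags, so its expected brightness is the mean of the Horror and Thriller averages rather than either one alone.
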
**Covariates**<br>
Release year<br>
Movie genre indicators: each movie has 22 binary (0/1) genre indicators<br>
MPAA film rating (e.g., PG, PG-13, R)<br>






# Preliminary Results
**Step 1**<br>
A linear regression was fitted for each genre indicator to test whether poster brightness differs significantly between movies with that indicator equal to 1 and movies with it equal to 0.

![title](Q3_plot1.jpeg)

The plot depicts the means of the two groups within each genre. Take "Action" as an example: the "No" group includes movies that do not have "Action" as one of their tags, and the "Yes" group includes movies with the "Action" tag. Pairs of means that differ significantly between the two groups are labeled with a “*” sign. Both the brightness and the height of each bar reflect the mean brightness of that group: the taller or brighter the bar, the higher the mean brightness of that group.
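
The Step 1 comparison can be sketched as a simple regression of brightness on a single 0/1 genre indicator, where the slope equals the difference between the "Yes" and "No" group means. This is a minimal sketch that assumes no additional covariates enter the model (the text does not say whether any were included); the function name `indicator_regression` and the sample values are hypothetical.

```python
from math import sqrt

def indicator_regression(brightness, indicator):
    """OLS of brightness on a 0/1 indicator; returns (intercept, slope, t_stat).

    Assumes both groups are non-empty and the residuals are not all zero.
    """
    n = len(brightness)
    x_bar = sum(indicator) / n
    y_bar = sum(brightness) / n
    sxx = sum((x - x_bar) ** 2 for x in indicator)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(indicator, brightness))
    slope = sxy / sxx                      # mean("Yes") - mean("No")
    intercept = y_bar - slope * x_bar      # mean of the "No" group
    rss = sum((y - intercept - slope * x) ** 2
              for x, y in zip(indicator, brightness))
    se = sqrt(rss / (n - 2) / sxx)         # standard error of the slope
    return intercept, slope, slope / se

# Made-up data: three movies with the tag (1) and three without (0).
brightness = [0.20, 0.30, 0.25, 0.60, 0.70, 0.65]
has_tag = [1, 1, 1, 0, 0, 0]

intercept, slope, t_stat = indicator_regression(brightness, has_tag)
print(intercept, slope, t_stat)
```

The t statistic would then be compared against a t distribution with n - 2 degrees of freedom to decide whether a genre earns the “*” label; the significance threshold used in the study is not stated here.
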

Like the previous plot, the following plot is a static image captured in RStudio; rerun the plotting code there to generate it dynamically.

![title](Plots/Q3_plot2.jpeg)

This plot is a series of scatterplots showing the correlation between the IMDB rating and the absolute deviance. Each sub-plot represents one genre, and the data points in it are grouped by that genre's indicator. The definitions of 0 and 1 are the same as the definitions of "No" and "Yes" above.
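
The quantity behind each sub-plot can be sketched as a Pearson correlation between the absolute deviance and the IMDB rating, computed separately for the 0 and 1 groups of one genre. This is an illustrative sketch only: the choice of Pearson's coefficient, the row layout, and the names `pearson` and `correlation_by_group` are assumptions, and the rows are made-up data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def correlation_by_group(rows, genre):
    """Correlation of |deviance| with rating, split by a 0/1 genre indicator."""
    result = {}
    for flag in (0, 1):
        group = [r for r in rows if r[genre] == flag]
        result[flag] = pearson([abs(r["deviance"]) for r in group],
                               [r["rating"] for r in group])
    return result

# Made-up rows: deviance, IMDB rating, and a hypothetical "Action" indicator.
rows = [
    {"deviance": -0.2, "rating": 6.0, "Action": 1},
    {"deviance": 0.1, "rating": 7.0, "Action": 1},
    {"deviance": 0.3, "rating": 5.0, "Action": 1},
    {"deviance": 0.1, "rating": 5.0, "Action": 0},
    {"deviance": -0.2, "rating": 6.0, "Action": 0},
    {"deviance": 0.3, "rating": 7.0, "Action": 0},
]

print(correlation_by_group(rows, "Action"))
```

Taking the absolute value treats posters that are darker than expected and posters that are brighter than expected as equally "deviant", which matches the absolute deviance shown on the plot axis.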