# Day 29: Scraping IMDb with Selenium (1/2)

Let's first look at how IMDb, the movie rating site, presents its movie information, then extract the fields we need and store them. The code is adapted from this [article](https://medium.com/datainpoint/python-essentials-web-scraping-with-selenium-638175f839ee).
![Title](2901.JPG)

In [1]:
# Import the required packages
from pyquery import PyQuery as pq
import pandas as pd

def get_movie_info(movie_url):
    """
    Get movie info from a given IMDb title URL.
    """
    d = pq(movie_url) # fetch and parse the page
    movie_rating = float(d("strong span").text()) # movie rating
    movie_genre = [x.text() for x in d(".subtext a").items()] # genre links, followed by the release date link
    movie_released_date = movie_genre.pop() # the last link is the release date, so remove it from the genres
    movie_poster = d(".poster img").attr('src') # poster image URL
    movie_cast = [x.text() for x in d(".primary_photo+ td a").items()] # cast names

    # Return the movie info
    movie_info = {
        "Rating": movie_rating,
        "Released_Date": movie_released_date,
        "Genre": movie_genre,
        "Poster_Link": movie_poster,
        "Cast": movie_cast
    }
    return movie_info

# Fetch one movie's info to take a look
the_dressmaker = get_movie_info("https://www.imdb.com/title/tt2910904/")
print(the_dressmaker)

{'Rating': 7.1, 'Released_Date': '8 January 2016 (Taiwan)', 'Genre': ['Comedy', 'Drama'], 'Poster_Link': 'https://m.media-amazon.com/images/M/MV5BMjA4MzAxNTc5OF5BMl5BanBnXkFtZTgwMjgzMDE4OTE@._V1_UX182_CR0,0,182,268_AL_.jpg', 'Cast': ['Kate Winslet', 'Judy Davis', 'Liam Hemsworth', 'Hugo Weaving', 'Julia Blake', 'Shane Bourne', 'Kerry Fox', 'Rebecca Gibney', 'Caroline Goodall', 'Gyton Grantley', 'Tracy Harvey', 'Sacha Horler', 'Shane Jacobson', 'Geneviève Lemon', 'James Mackay']}
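
The function above assumes every selector matches something. If a selector comes back empty (for example, because the page markup differs from what the selectors expect), `float("")` raises `ValueError` and `pop()` on an empty list raises `IndexError`. Below is a minimal sketch of two guard helpers using only the standard library. The names `parse_rating` and `split_genre_and_date` are hypothetical and not part of the original code.

```python
from typing import List, Optional, Tuple


def parse_rating(text: str) -> Optional[float]:
    """Convert scraped rating text to a float, or None if it is empty or malformed."""
    try:
        return float(text.strip())
    except ValueError:
        return None


def split_genre_and_date(subtext_links: List[str]) -> Tuple[List[str], Optional[str]]:
    """Split the '.subtext a' link texts into (genres, release date).

    Assumes, as the original code does, that the last link is the release date.
    """
    if not subtext_links:
        return [], None
    *genres, released_date = subtext_links
    return genres, released_date


print(parse_rating("7.1"))
print(parse_rating(""))
print(split_genre_and_date(["Comedy", "Drama", "8 January 2016 (Taiwan)"]))
print(split_genre_and_date([]))
```

Inside `get_movie_info`, these could replace the bare `float(...)` call and the `pop()` call, so that a missing field shows up as `None` instead of stopping the scrape.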


![Title](2902.JPG)

In [2]:
# Convert the info into a DataFrame to take a look
df = pd.DataFrame.from_dict(the_dressmaker, orient='index')
df.transpose()

| | Rating | Released_Date | Genre | Poster_Link | Cast |
|---|---|---|---|---|---|
| 0 | 7.1 | 8 January 2016 (Taiwan) | [Comedy, Drama] | https://m.media-amazon.com/images/M/MV5BMjA4Mz... | [Kate Winslet, Judy Davis, Liam Hemsworth, Hug... |
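
Since the goal is to keep the scraped information, one option is to write the dictionary to disk as JSON, which preserves list-valued fields such as `Genre` and `Cast` as they are. The following is a minimal sketch using only the standard library. The helper names `save_movie_info` and `load_movie_info` and the file name are assumptions for illustration, and `sample` is a shortened copy of the output shown earlier.

```python
import json
import os
import tempfile
from pathlib import Path


def save_movie_info(movie_info: dict, path: str) -> None:
    """Write one movie's info to a JSON file."""
    # ensure_ascii=False keeps names such as "Geneviève Lemon" readable in the file
    text = json.dumps(movie_info, ensure_ascii=False, indent=2)
    Path(path).write_text(text, encoding="utf-8")


def load_movie_info(path: str) -> dict:
    """Read one movie's info back from a JSON file."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Shortened copy of the scraped dictionary, used here instead of a live request
sample = {
    "Rating": 7.1,
    "Released_Date": "8 January 2016 (Taiwan)",
    "Genre": ["Comedy", "Drama"],
    "Cast": ["Kate Winslet", "Judy Davis", "Geneviève Lemon"],
}

# Write to a temporary directory so the demo leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = os.path.join(tmp_dir, "the_dressmaker.json")
    save_movie_info(sample, json_path)
    round_trip_ok = load_movie_info(json_path) == sample

print(round_trip_ok)
```

In the notebook itself, `save_movie_info(the_dressmaker, "the_dressmaker.json")` would store the full result next to the notebook.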


Please let me know if there are any mistakes in this article. Thanks for reading.

References:

[1] [Scraping website data by controlling a browser](https://medium.com/datainpoint/python-essentials-web-scraping-with-selenium-638175f839ee)

[2] [What version of Chrome do I have?](https://www.whatismybrowser.com/detect/what-version-of-chrome-do-i-have)

[3] [ChromeDriver - WebDriver for Chrome](http://chromedriver.chromium.org/downloads)

[4] [IMDb](https://www.imdb.com/)

[5] [Stack Overflow](https://stackoverflow.com/questions/47148872/webdrivers-executable-may-have-wrong-permissions-please-see-https-sites-goo)
