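# Assignment 8 - Storing a Painter's Works in MongoDB as Base64-Encoded Data
### 2018.07.20 Korea University, 조성표
I crawled the Wikimedia Commons category page for Renoir using regular expressions. Every image was saved at its original size; the resulting painting database holds about 170 paintings and takes up 1.1 GB.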

**In MongoDB, a database is not created until it gets content!**

MongoDB waits until you have created a collection (table), with at least one document (record) before it actually creates the database (and collection).
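
The lazy creation described above can be checked with a short sketch like the one below. The function name `check_lazy_creation` and the scratch database name `lazy_creation_demo` are assumptions made for illustration and are not part of the original notebook; `client` is expected to be a `pymongo.MongoClient` connected to a running server.

```python
def check_lazy_creation(client, db_name='lazy_creation_demo'):
    """Return (exists_before_insert, exists_after_insert) for db_name.

    `db_name` is a hypothetical scratch database; it is dropped at the end.
    """
    db = client[db_name]  # only a handle, nothing is created on the server yet
    before = db_name in client.list_database_names()

    # The first inserted document is what makes the database (and the
    # 'probe' collection) actually appear on the server.
    db['probe'].insert_one({'hello': 'world'})
    after = db_name in client.list_database_names()

    client.drop_database(db_name)  # clean up the scratch database
    return before, after

# Usage against a local server (assumes mongod listens on localhost:27017):
#   from pymongo import MongoClient
#   print(check_lazy_creation(MongoClient('localhost', 27017)))
```

If MongoDB behaves as described, the first value should be `False` and the second `True`, provided no database with the scratch name existed beforehand.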

In [149]:
def add_pic(title, encoded_pic, artist='Auguste Renoir'):
    # Relies on a module-level `db` handle (set up in the cell that connects to MongoDB).
    new_pic = {'artist': artist,
               'title': title,
               'base64_encoded': encoded_pic}
    db.Renoir.insert_one(new_pic)
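
As a counterpart to `add_pic`, the sketch below shows how a document of that shape could be turned back into raw image bytes. The helper name `decode_pic` and the sample document are assumptions for illustration; in practice the document would come from something like `db.Renoir.find_one({'title': ...})`.

```python
import base64

def decode_pic(doc):
    """Return the raw image bytes for a document shaped like add_pic's."""
    # b64decode accepts both bytes and ASCII str, so this works whether the
    # field was stored as binary data or as text.
    return base64.b64decode(doc['base64_encoded'])

# Hypothetical stand-in for a stored document (not real image data).
sample_bytes = b'\xff\xd8\xff\xe0 sample bytes, not a real JPEG'
sample_doc = {'title': 'Sample',
              'base64_encoded': base64.b64encode(sample_bytes)}

restored = decode_pic(sample_doc)
print(restored == sample_bytes)
```

Writing the result to disk is then a matter of opening a file in `'wb'` mode and writing the returned bytes.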

In [151]:
import re
import base64
import pymongo
import requests
from bs4 import BeautifulSoup
from pymongo import MongoClient

# Get handles to the database and its 'Renoir' collection.
# Neither exists on the server until the first document is inserted.
client = MongoClient('localhost', 27017)
db = client['renoir_gallery']

html = requests.get('https://commons.wikimedia.org/wiki/Category:Paintings_by_Pierre-Auguste_Renoir')
soup = BeautifulSoup(html.text, 'lxml')

title_list = [title.get_text().replace('.jpg', '') for title in soup.select('div.gallerytext a')]
image_list = [link.get('src') for link in soup.select('li.gallerybox img')]

img_dict = dict(zip(title_list, image_list))

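# Drop entries that are not .jpg files: '.jpg' was stripped from the titles above,
# so any title that still contains a '.' has another extension (or a dot in its name).
for key in list(img_dict):
    if '.' in key:
        del img_dict[key]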

# Replace each thumbnail URL with the URL of the full-size original image
img_dict = {k: re.findall(r'(^https.+jpg)/.+px?', v)[0].replace('thumb/', '')
            for k, v in img_dict.items()}

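for img_title, img_url in img_dict.items():
    req = requests.get(img_url)
    req.raise_for_status()  # fail loudly rather than storing an HTTP error page
    encoded = base64.b64encode(req.content)  # base64-encoded bytes of the image

    add_pic(img_title, encoded)
    print('{} Successfully encoded!'.format(img_title))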

'Au Bord de l'Eau' by Pierre-Auguste Renoir, 1885 Successfully encoded!
Cópia de Saules et personnages dans une barque Successfully encoded!
Pierre-Auguste Renoir - Chestnut Tree in Bloom Successfully encoded!
1887, Renoir, River Landscape Successfully encoded!
Pierre Auguste Renoir Noirmoutier Successfully encoded!
Claude Terrasse, by Pierre-Auguste Renoir Successfully encoded!
1903 Renoir Blick aufs Meer Successfully encoded!
A Road in Louveciennes MET DP265242 Successfully encoded!
A Waitress at Duval's Restaurant MET DT1488 Successfully encoded!
A Young Girl with Daisies MET DT2161 Successfully encoded!
Auguste Renoir - Field of Banana Trees - Google Art Project Successfully encoded!
Auguste Renoir - Maternity - Google Art Project Successfully encoded!
Auguste Renoir - Pont Neuf, Paris - Google Art Project Successfully encoded!
Auguste Renoir - Snow-covered Landscape - Google Art Project Successfully encoded!
Auguste Renoir - The Mosque - Google Art Project Successfully encoded!
Au

Pierre-Auguste Renoir - Children on the Seashore, Guernsey (Enfants au bord de la mer à Guernesey) - BF10 - Barnes Foundation Successfully encoded!
Pierre-Auguste Renoir - Claude Renoir - BF935 - Barnes Foundation Successfully encoded!
Pierre-Auguste Renoir - Composition, Five Bathers (Composition, cinq baigneuses) - BF902 - Barnes Foundation Successfully encoded!
Pierre-Auguste Renoir - Cup of Chocolate (La Tasse de chocolat) - BF40 - Barnes Foundation Successfully encoded!
Pierre-Auguste Renoir - Dovecote at Bellevue (Pigeonnier de Bellevue) - BF969 - Barnes Foundation Successfully encoded!
Pierre-Auguste Renoir - Embroiderers (Les Brodeuses) - BF239 - Barnes Foundation Successfully encoded!
Pierre-Auguste Renoir - Environs of Berneval - BF6 - Barnes Foundation Successfully encoded!
Pierre-Auguste Renoir - Environs of Briey (Environs de Briey) - BF151 - Barnes Foundation Successfully encoded!
Pierre-Auguste Renoir - Farmhouse (La Ferme) - BF47 - Barnes Foundation Successfully encoded

Pierre-Auguste Renoir - Peninsula of Saint-Jean (Presqu'île de Saint-Jean) - BF240 - Barnes Foundation Successfully encoded!
Pierre-Auguste Renoir - Picnic (Le Déjeuner sur l'herbe) - BF567 - Barnes Foundation Successfully encoded!
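
The step that turns thumbnail URLs into full-size image URLs can be isolated as in the sketch below. The helper name `thumb_to_original` and the sample URL are assumptions for illustration, and the pattern is somewhat stricter than the one in the cell above: it expects the `<width>px-` prefix that thumbnail file names carry and returns `None` instead of raising when a URL does not match.

```python
import re

# Captures everything up to the original file name, which must be followed
# by the '/<width>px-' prefix of the thumbnail file name.
THUMB_PATTERN = re.compile(r'(https.+\.jpg)/\d+px-')

def thumb_to_original(thumb_url):
    """Return the full-size URL for a thumbnail URL, or None if it does not match."""
    match = THUMB_PATTERN.match(thumb_url)
    if match is None:
        return None
    # Removing the 'thumb' path segment leaves the URL of the original file.
    return match.group(1).replace('/thumb/', '/', 1)

# Hypothetical thumbnail URL in the upload.wikimedia.org layout.
sample = ('https://upload.wikimedia.org/wikipedia/commons/'
          'thumb/a/ab/Example.jpg/120px-Example.jpg')
print(thumb_to_original(sample))
```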
