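# Session 6: Spark JDBC to MongoDB

### Table of Contents
* [1. Spark JDBC Environment Setup](#1.-Spark-JDBC-Environment-Setup)
* [10. References](#10.-References)


### 1. Spark JDBC Environment Setup

#### 1.1. Starting MongoDB

> This section assumes that MongoDB is already running, started from a terminal.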

```python
from pyspark.sql import *
from pyspark.sql.functions import *
from pyspark.sql.types import *
from IPython.display import display, display_pretty, clear_output, JSON

spark = (
    SparkSession
    .builder
    .config("spark.sql.session.timeZone", "Asia/Seoul")
    # Default read/write target: host "mongo", database "default", collection "people"
    .config("spark.mongodb.input.uri", "mongodb://mongo/default.people")
    .config("spark.mongodb.output.uri", "mongodb://mongo/default.people")
    .getOrCreate()
)

# Render DataFrames as tables in the notebook
spark.conf.set("spark.sql.repl.eagerEval.enabled", True)  # enable eager display
spark.conf.set("spark.sql.repl.eagerEval.truncate", 100)  # max characters shown per cell

# Common data locations
home_jovyan = "/home/jovyan"
work_data = f"{home_jovyan}/work/data"
work_dir = !pwd  # IPython shell capture; returns a list of output lines
work_dir = work_dir[0]

# Tuning for a local environment
spark.conf.set("spark.sql.shuffle.partitions", 5)  # number of partitions used when shuffling data for joins or aggregations
spark.conf.set("spark.sql.streaming.forceDeleteTempCheckpointLocation", "true")
spark
```
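
The two `spark.mongodb.*.uri` settings above pack the host, database, and collection into a single string. As a rough illustration only, the sketch below splits such a URI with Python's standard `urllib.parse`. The helper name `split_mongo_uri` is hypothetical and not part of the connector, and the fallback port 27017 is MongoDB's default, assumed here because the URI omits it.

```python
from urllib.parse import urlparse


def split_mongo_uri(uri):
    """Split 'mongodb://host[:port]/database.collection' into its parts.

    Hypothetical helper for illustration; the connector does its own parsing.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "mongodb":
        raise ValueError(f"not a mongodb:// URI: {uri}")
    # The path looks like '/default.people': database, then collection.
    namespace = parsed.path.lstrip("/")
    database, _, collection = namespace.partition(".")
    return {
        "host": parsed.hostname,
        "port": parsed.port or 27017,  # assumption: MongoDB default port
        "database": database,
        "collection": collection,
    }


print(split_mongo_uri("mongodb://mongo/default.people"))
# {'host': 'mongo', 'port': 27017, 'database': 'default', 'collection': 'people'}
```

Read this way, the configuration targets the `people` collection of the `default` database on the host named `mongo`, which matches the database and collection inspected from the shell later in this section.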

```python
people = spark.createDataFrame(
    [
        ("Bilbo Baggins", 50),
        ("Gandalf", 1000),
        ("Thorin", 195),
        ("Balin", 178),
        ("Kili", 77),
        ("Dwalin", 169),
        ("Oin", 167),
        ("Gloin", 158),
        ("Fili", 82),
        ("Bombur", None),  # missing age is stored as null
    ],
    ["name", "age"],
)

# Append the rows to the collection configured in spark.mongodb.output.uri
people.write.format("mongo").mode("append").save()
people.show()
```

```
+-------------+----+
|         name| age|
+-------------+----+
|Bilbo Baggins|  50|
|      Gandalf|1000|
|       Thorin| 195|
|        Balin| 178|
|         Kili|  77|
|       Dwalin| 169|
|          Oin| 167|
|        Gloin| 158|
|         Fili|  82|
|       Bombur|NULL|
+-------------+----+
```
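
To confirm the write, the collection can be read back into a DataFrame with the same `mongo` format. This is a minimal sketch: `load_people` is a hypothetical helper, and passing the URI through `.option("uri", ...)` is assumed to behave like the session-level `spark.mongodb.input.uri` setting for the connector version used in this notebook.

```python
def load_people(spark, uri="mongodb://mongo/default.people"):
    """Read the MongoDB collection back as a Spark DataFrame.

    `spark` is the SparkSession created earlier in this notebook.
    The default URI mirrors the one configured for the session.
    """
    return spark.read.format("mongo").option("uri", uri).load()


# Usage sketch (needs the running Spark session and MongoDB container):
# stored = load_people(spark)
# stored.printSchema()                    # includes MongoDB's _id field
# stored.filter("age >= 100").show()
```

Because the write above uses `mode("append")`, rerunning that cell inserts the ten rows again, so the number of documents read back grows with every run.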



#### 1.2. Connecting to MongoDB
```bash
docker exec -it mongo mongo
> use default
> show tables
> db.people.findOne()
> db.people.find()
```
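
The same checks can also be made from Python instead of the shell. The sketch below assumes the `pymongo` package, which this notebook does not otherwise use and which would need to be installed separately. `peek_collection` is a hypothetical helper, and the host `mongo` and port 27017 in the usage comment are assumptions taken from the Spark URI and MongoDB's default port.

```python
def peek_collection(collection, limit=5):
    """Return the first document and up to `limit` documents.

    `collection` is expected to be a pymongo Collection
    (or any object exposing find_one() and find()).
    """
    first = collection.find_one()                # like db.people.findOne()
    docs = list(collection.find().limit(limit))  # like db.people.find()
    return first, docs


# Usage sketch (assumes `pip install pymongo` and a reachable server):
# from pymongo import MongoClient
# client = MongoClient("mongodb://mongo:27017")
# first, docs = peek_collection(client["default"]["people"])
# print(first)
# print(len(docs))
```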

## 10. References

1. https://docs.mongodb.com/spark-connector/current/python-api/#python-basics
2. https://hub.docker.com/_/mongo
3. https://www.mongodb.com/products/compass