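#### The selenium module
 
 - A framework for testing web applications, typically used to test websites
 - Lets a program, rather than a user, control the web browser
 - Each web browser needs its own client program (WebDriver), which relays commands between your program and the browser
 - Geared more toward controlling the browser than toward crawling
 - Install with pip install selenium
 - Download the WebDriver that matches your Chrome version : https://chromedriver.chromium.org/downloads
 - Unzip the archive and move chromedriver.exe to a convenient path
 - Import the selenium module and point it at the path of the WebDriver you installed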
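
The last setup step can be sketched as below. This is a minimal sketch that assumes Selenium 4, where the driver path is wrapped in a Service object instead of being passed as the first positional argument. The helper name make_chrome and its default path are illustrative assumptions, and the imports sit inside the function so that defining it does not itself require Selenium.

```python
def make_chrome(driver_path="C:/tool/chromedriver.exe", headless=False):
    """Start Chrome through the chromedriver at driver_path (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require Selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    if headless:
        # Run without opening a visible browser window.
        options.add_argument("--headless")
    return webdriver.Chrome(service=Service(driver_path), options=options)
```

A typical use would be driver = make_chrome(), then driver.get(url), and finally driver.quit(). Recent Selenium 4 releases can also locate a matching driver on their own, in which case the explicit path may be unnecessary.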
 
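Controlling the page through events : because Selenium drives a real browser, it can click the mouse, type on the keyboard, and run JavaScript directly
 - Mouse click : click()
 - Keyboard input : send_keys()
 - Run JavaScript : execute_script()
 - Submit a form : submit()
 - Screenshot : screenshot(filename) on an element, save_screenshot(filename) on the driver
 - Clear text : clear()
 - Go back : back()
 - Go forward : forward()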
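
Several of the methods above are usually chained on one element. The following is a minimal sketch, assuming a driver object that offers Selenium's find_element(by, value); the helper name fill_and_submit is hypothetical.

```python
def fill_and_submit(driver, by, value, text):
    """Type text into the located input and submit its form (hypothetical helper)."""
    box = driver.find_element(by, value)  # locate the input field
    box.clear()                           # erase any existing text
    box.send_keys(text)                   # type the new text
    box.submit()                          # submit the enclosing form
    return box
```

For example, fill_and_submit(driver, "id", "query", "빅데이터") would type the keyword into the element whose id is query and send the form.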

#### Selenium locator functions (15 in total, in singular element and plural elements forms)
 - find_element_by_id : locate by the id attribute
 - find_element(s)_by_class_name : locate by class name
 - find_element(s)_by_name : locate by the name attribute
 - find_element(s)_by_xpath : locate by an XPath expression
 - find_element(s)_by_link_text : locate an anchor tag (a) by its full link text
 - find_element(s)_by_partial_link_text : locate an anchor tag (a) by part of its link text
 - find_element(s)_by_tag_name : locate by tag name
 - find_element(s)_by_css_selector : locate by CSS selector
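
These helpers belong to Selenium 3. They were deprecated in Selenium 4 and removed in 4.3, where find_element(by, value) and find_elements(by, value) take their place. The sketch below shows how each legacy name maps onto the newer call. The strategy strings equal the constants on Selenium's By class, while find_legacy is a hypothetical helper written only for illustration.

```python
# Locator strategy strings; these equal the constants on selenium's By class
# (By.ID == "id", By.CLASS_NAME == "class name", and so on).
STRATEGIES = {
    "id": "id",
    "class_name": "class name",
    "name": "name",
    "xpath": "xpath",
    "link_text": "link text",
    "partial_link_text": "partial link text",
    "tag_name": "tag name",
    "css_selector": "css selector",
}

def find_legacy(driver, legacy_name, value):
    """Call the find_element(s) equivalent of a find_element(s)_by_* name (hypothetical)."""
    prefix, _, suffix = legacy_name.partition("_by_")
    if prefix not in ("find_element", "find_elements") or suffix not in STRATEGIES:
        raise ValueError(f"unknown locator method: {legacy_name}")
    # find_element returns one element; find_elements returns a list.
    return getattr(driver, prefix)(STRATEGIES[suffix], value)
```

With this mapping, find_legacy(driver, "find_elements_by_tag_name", "a") performs the same lookup as driver.find_elements("tag name", "a").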

In [1]:
!pip install selenium



In [2]:
# Load the Naver site
import selenium
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import warnings
warnings.filterwarnings('ignore')

url = 'https://naver.com'
path = 'C:/tool/chromedriver.exe'
driver = webdriver.Chrome(path)
driver.get(url)

In [3]:
print(driver.current_url)

https://www.naver.com/


In [4]:
driver.close()

In [5]:
# Run without opening a visible browser window (headless mode)
options = webdriver.ChromeOptions()
options.add_argument('headless')
driver = webdriver.Chrome('C:/tool/chromedriver.exe',options=options)
driver.get('https://google.com')
print(driver.current_url)
driver.close()

https://www.google.com/


In [6]:
# Minimize / maximize the browser window
driver = webdriver.Chrome(path)
driver.get('https://google.com')
driver.maximize_window()

In [7]:
driver.minimize_window()

In [8]:
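# Explicit wait: WebDriverWait blocks for up to the given number of seconds until the target element has loaded
# (an implicit wait is set separately with driver.implicitly_wait(seconds))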
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome(path)
driver.get('https://google.com')

try:
    element = WebDriverWait(driver,5).until(
    EC.presence_of_element_located((By.CLASS_NAME,'gLFyf'))
    )
finally:
    driver.quit()
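
WebDriverWait(driver, 5).until(condition), as used in the cell above, keeps calling the condition until it returns a truthy value or the timeout passes. The loop below is a simplified illustration of that polling idea in plain Python, not Selenium's actual implementation. The name wait_until and its parameters are hypothetical, and the 0.5 second interval mirrors WebDriverWait's default poll frequency.

```python
import time

def wait_until(condition, timeout=5.0, poll=0.5, ignored=(LookupError,)):
    """Call condition() repeatedly until it returns a truthy value or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored:
            pass  # treat "not found yet" as "keep waiting"
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)
```

The point of the design is that the caller returns as soon as the element appears, instead of always sleeping for a fixed time.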


In [9]:
from time import sleep
driver = webdriver.Chrome(path)
driver.get('https://naver.com')
search_box = driver.find_element_by_xpath('//*[@id="query"]')
search_box.send_keys('빅데이터')
search_box.send_keys(Keys.RETURN)

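# Print the result titles without opening each result by hand
elements = driver.find_elements_by_xpath('//*[@id="main_pack"]/section/div/div/div/div/h3/a')

# Print each title and append it to a text file (the dataset/ folder must already exist)
with open('dataset/test_set.txt', 'a', encoding='utf-8') as f:
    for element in elements:
        print(element.text)
        print(element.text, file=f)
sleep(3)
driver.close()

# Full XPaths of two individual result titles; dropping the [n] indices gives the general XPath used above
# //*[@id="main_pack"]/section[2]/div/div[2]/div[1]/div[1]/h3/a
# //*[@id="main_pack"]/section[2]/div/div[2]/div[3]/div[1]/h3/a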

빅데이터란
빅데이터
빅데이터
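
The XPath passed to find_elements_by_xpath in the cell above is the copied full XPath of one title with its positional indices removed, so that it matches every result rather than a single one. A minimal sketch of that step, where strip_indices is a hypothetical helper name:

```python
import re

def strip_indices(xpath):
    """Remove positional predicates such as [2] so the XPath matches every sibling."""
    return re.sub(r"\[\d+\]", "", xpath)

full = '//*[@id="main_pack"]/section[2]/div/div[2]/div[1]/div[1]/h3/a'
print(strip_indices(full))
# prints: //*[@id="main_pack"]/section/div/div/div/div/h3/a
```

Attribute predicates such as [@id="main_pack"] are left alone because the pattern only matches purely numeric indices.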


In [10]:
import time
driver = webdriver.Chrome(path)
driver.get('https://naver.com')
driver.maximize_window()
driver.find_element_by_class_name('link_login').click()
time.sleep(1)
driver.back()
driver.forward()
driver.refresh()
time.sleep(2)
driver.back()
elem = driver.find_element_by_id('query')
elem.send_keys(Keys.ENTER)
els = driver.find_elements_by_tag_name('a')
for e in els:
    print(e.get_attribute('href'))
driver.close()

https://search.naver.com/search.naver?where=nexearch&sm=top_hty&fbm=0&ie=utf8&query=#lnb
https://search.naver.com/search.naver?where=nexearch&sm=top_hty&fbm=0&ie=utf8&query=#content
https://www.naver.com/
https://search.naver.com/search.naver?where=nexearch&sm=top_hty&fbm=0&ie=utf8&query=#
https://search.naver.com/search.naver?where=nexearch&sm=top_hty&fbm=0&ie=utf8&query=#
https://search.naver.com/search.naver?where=nexearch&sm=top_hty&fbm=0&ie=utf8&query=#
https://search.naver.com/search.naver?where=nexearch&sm=top_hty&fbm=0&ie=utf8&query=#
https://search.naver.com/search.naver?where=nexearch&sm=top_hty&fbm=0&ie=utf8&query=#
https://help.naver.com/support/alias/search/word/word_29.naver
https://help.naver.com/support/alias/search/word/word_29.naver
https://help.naver.com/support/alias/search/word/word_29.naver
https://search.naver.com/search.naver?where=nexearch&sm=top_hty&fbm=0&ie=utf8&query=#
https://search.naver.com/search.naver?where=nexearch&sm=top_hty&fbm=0&ie=utf8&query=#
http

https://help.naver.com/support/alias/search/word/word_30.naver
https://m.news.naver.com/covid19/index.nhn
https://search.naver.com/search.naver?where=nexearch&query=%EC%BD%94%EB%A1%9C%EB%82%9819+%EC%84%A0%EB%B3%84%EC%A7%84%EB%A3%8C%EC%86%8C&sm=tab_etc
https://news.naver.com/main/factcheck/main.nhn?section=%C4%DA%B7%CE%B3%AA%B9%E9%BD%C5
https://search.naver.com/search.naver?where=nexearch&query=%EC%9A%B0%EB%A6%AC%EB%8F%99%EB%84%A4+%EB%B0%B1%EC%8B%A0%EC%95%8C%EB%A6%BC&sm=tab_etc
https://search.naver.com/search.naver?where=nexearch&query=%EC%BD%94%EB%A1%9C%EB%82%9819+%EB%B3%91%EC%83%81%EA%B0%80%EB%8F%99%EB%A5%A0&sm=tab_etc
https://search.naver.com/search.naver?where=nexearch&sm=top_hty&fbm=0&ie=utf8&query=#
https://www.naver.com/more.html
https://policy.naver.com/policy/service.html
https://policy.naver.com/policy/privacy.html
https://help.naver.com/support/alias/search/integration/integration_1.naver
https://www.navercorp.com/
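
The output above repeats many links. A minimal sketch for collecting each href only once, assuming objects that offer get_attribute the way Selenium elements do; unique_links is a hypothetical helper name.

```python
def unique_links(elements):
    """Return each element's href once, in first-seen order, skipping anchors without one."""
    seen = set()
    links = []
    for element in elements:
        href = element.get_attribute("href")
        if href and href not in seen:
            seen.add(href)
            links.append(href)
    return links
```

Calling unique_links(els) in place of the print loop would give a list that is easier to save or compare.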


In [11]:
# Run this cell yourself to practice
import time
driver = webdriver.Chrome(path)
driver.get('https://naver.com')
driver.maximize_window()
driver.find_element_by_class_name('link_login').click()
time.sleep(1)
driver.back()
elem = driver.find_element_by_id('query')
elem.send_keys(Keys.ENTER)
els = driver.find_elements_by_tag_name('a')
for e in els[:5]:    # print only the first five links
    print(e.get_attribute('href'))
driver.refresh()
driver.get('http://daum.net')
driver.close()

https://search.naver.com/search.naver?where=nexearch&sm=top_hty&fbm=0&ie=utf8&query=#lnb
https://search.naver.com/search.naver?where=nexearch&sm=top_hty&fbm=0&ie=utf8&query=#content
https://www.naver.com/
https://search.naver.com/search.naver?where=nexearch&sm=top_hty&fbm=0&ie=utf8&query=#
https://search.naver.com/search.naver?where=nexearch&sm=top_hty&fbm=0&ie=utf8&query=#


In [12]:
import time
driver = webdriver.Chrome(path)
driver.get('https://naver.com')
driver.maximize_window()

In [13]:
# Q. Search Daum for '빅데이터' (big data), print the results, then move to the Google page


In [14]:
import time
driver = webdriver.Chrome(path)
driver.get('https://daum.net')
driver.maximize_window()
time.sleep(2)
search_box = driver.find_element_by_xpath('//*[@id="q"]')
search_box.send_keys('빅데이터')
search_box.send_keys(Keys.RETURN)
driver.refresh()
time.sleep(2)
driver.get('http://google.com')
search_box = driver.find_element_by_xpath('/html/body/div[1]/div[3]/form/div[1]/div[1]/div[1]/div/div[2]/input')
search_box.send_keys('빅데이터')
search_box.send_keys(Keys.RETURN)
time.sleep(2)
driver.close()

In [15]:
import time
driver = webdriver.Chrome(path)
driver.get('https://naver.com')
driver.maximize_window()
driver.find_element_by_css_selector('#NM_FAVORITE > div.group_nav > ul.list_nav.NM_FAVORITE_LIST > li:nth-child(2) > a').click()
elements = driver.find_elements_by_css_selector('#ct > div > section.main_content > div.main_brick > div > div> div > div > div > div> a > div> div')
for e in elements[:5]:    # print only the first five items
    print(e.text)
driver.close()

머니투데이
내용작성전



이재명 민주당 대선 후보 제주 방문...도민 지지 호소


In [16]:
# [Assignment] Log in to the Hanbit Media site automatically and print the mileage balance of 2,000 points (use Selenium to drive the browser)

In [20]:
url = 'https://www.hanbit.co.kr/index.html'
path = 'C:/tool/chromedriver.exe'
driver = webdriver.Chrome(path)
driver.get(url)
driver.find_element_by_class_name('login').click()

id_box = driver.find_element_by_css_selector('#m_id')
pw_box = driver.find_element_by_id('m_passwd')
id_box.send_keys('')    # enter your Hanbit Media ID here
pw_box.send_keys('')    # enter your password here
driver.find_element_by_xpath('//*[@id="login_btn"]').click()
driver.find_element_by_xpath('//*[@id="wrap_nav"]/ul[2]/li[3]/a').click()

mileage_sections = driver.find_elements_by_xpath('//*[@id="container"]/div/div[2]/dl[1]')

for mileage_section in mileage_sections:
    print(mileage_section.text)

driver.refresh()
# driver.close()


마일리지
2,000 점
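
The mileage arrives as text. If it is needed as a number, it can be extracted as in the minimal sketch below, which assumes the text has the same shape as the output above; parse_points is a hypothetical helper name.

```python
import re

def parse_points(text):
    """Extract the first number (commas allowed) from text such as '2,000 점'."""
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        raise ValueError(f"no number found in {text!r}")
    return int(match.group().replace(",", ""))

print(parse_points("마일리지\n2,000 점"))  # prints 2000
```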


In [None]:
# [Challenge] Log in to Naver and print the mail list

In [None]:
import time
import selenium
from selenium import webdriver

path = 'C:/tool/chromedriver.exe'
driver = webdriver.Chrome(path)
driver.get('https://www.naver.com')
driver.maximize_window()
time.sleep(1)
element = driver.find_element_by_class_name('link_login')
element.click()
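user_id = ''    # your Naver ID (renamed from id to avoid shadowing the built-in)
user_pw = ''    # your Naver password

# execute_script() runs JavaScript code inside the page
# In JavaScript, document.getElementById('id').value reads or sets the value of an input
driver.execute_script("document.getElementById('id').value='" + user_id + "'") # text
time.sleep(1)
driver.execute_script("document.getElementById('pw').value='" + user_pw + "'")
time.sleep(1)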
time.sleep(1)

element = driver.find_element_by_class_name('btn_login')
element.click()
print(driver.page_source)
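
Building the script by string concatenation breaks as soon as the value contains a quote character. execute_script also accepts extra arguments, which the script sees as arguments[0], arguments[1], and so on. A minimal sketch of that variant, assuming a driver with Selenium's execute_script(script, *args); set_input_value is a hypothetical helper name.

```python
def set_input_value(driver, element_id, value):
    """Set an input's value through JavaScript without splicing strings into the script."""
    # Extra arguments to execute_script are exposed to the script as arguments[0], arguments[1], ...
    script = "document.getElementById(arguments[0]).value = arguments[1];"
    return driver.execute_script(script, element_id, value)
```

The two calls in the cell above would then read set_input_value(driver, 'id', ...) and set_input_value(driver, 'pw', ...), each with the corresponding credential as the last argument.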
