# [selenium](https://selenium-python.readthedocs.io/)

#### [selenium(chinese)](https://selenium-python-zh.readthedocs.io/en/latest/index.html)
#### [Doc](https://selenium.dev/selenium/docs/api/py/api.html)
#### [Chrome driver download](https://sites.google.com/a/chromium.org/chromedriver/downloads)
* Choose the release that matches your Chrome version
* Put the driver in a fixed folder (`/home/sppool/.local/share/chromedriver`)

In [1]:
import time
import requests  # fetch web page data
import bs4
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

In [2]:
# Open the browser (the argument is the path to the chromedriver executable)
chrome = webdriver.Chrome('/home/sppool/.local/share/chromedriver')

### [Driver Operations](https://selenium.dev/selenium/docs/api/py/webdriver_remote/selenium.webdriver.remote.webdriver.html)

|member|type|description|
|:---|---|---:|
|`.page_source`|(str)|HTML source of the page|
|`.title`|(str)|page title|
|`.current_url`|(str)|current URL|
|`.implicitly_wait(s)`|(sec)|wait up to `s` seconds when an element being looked up has not appeared yet|
|`.save_screenshot(file_name)`|X|save a screenshot of the window|
|`.find_element()`|X|find the first matching element|
|`.find_elements()`|X|find all matching elements|
|`.get(url)`|X|open a page|
|`.back()`|X|go back one page|
|`.forward()`|X|go forward one page|
|`.refresh()`|X|reload the page (F5)|
|`.maximize_window()`|X|maximize the window|
|`.minimize_window()`|X|minimize the window|
|`.fullscreen_window()`|X|full screen|
|`.close()`|X|close the current tab|
|`.quit()`|X|close the whole browser|
|`.execute_script()`|X|run JavaScript|


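#### Waiting
* `time.sleep(s)`: forced wait, always blocks for the full `s` seconds
* `driver.implicitly_wait(s)`: implicit wait, best set once at the top; it does not force a wait when the data is already there
    * It waits on the fully loaded page, not on content produced by JavaScript that still has to be triggered
* [Explicit wait](https://stackoverflow.max-everyday.com/2019/01/python-selenium-wait/)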
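
To make the difference between a forced wait and an explicit wait concrete, here is a minimal pure-Python sketch of the polling idea behind an explicit wait. `wait_until` and its parameters are hypothetical names used only for illustration; this is not Selenium's API. Selenium provides its own implementation as `WebDriverWait` in `selenium.webdriver.support.ui`.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass.

    Hypothetical helper that mimics the idea of an explicit wait:
    it returns as soon as the condition holds instead of always
    sleeping for a fixed time like time.sleep(s).
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


# Example use with a driver (assumed to exist as `chrome`):
# wait_until(lambda: "Python" in chrome.title, timeout=5)
```

Unlike `time.sleep(s)`, the helper stops waiting the moment the condition is satisfied, and it raises an error instead of silently continuing when the timeout is reached.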

In [3]:
url = "http://www.python.org"
chrome.get(url)

# assert "Python" in chrome.title, 'error'

### [Locating elements: Driver.find_element](https://selenium-python.readthedocs.io/locating-elements.html)

* `.find_element()`
    * find_element_by_id (attribute)
    * find_element_by_name (attribute)
    * find_element_by_xpath
    * find_element_by_link_text (text of an `<a>` hyperlink)
    * find_element_by_partial_link_text (text of an `<a>` hyperlink, a partial match is enough)
    * find_element_by_tag_name (tag)
    * find_element_by_class_name (attribute)
    * find_element_by_css_selector

* `.find_elements()`
    * find_elements_by_name
    * find_elements_by_xpath
    * find_elements_by_link_text
    * find_elements_by_partial_link_text
    * find_elements_by_tag_name
    * find_elements_by_class_name
    * find_elements_by_css_selector
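
The `find_element_by_*` helpers listed above belong to Selenium 3; later Selenium 4 releases dropped them in favour of the two-argument form `find_element(by, value)`. The sketch below maps the old helper names onto that form. `LOCATORS`, `find`, and `find_all` are hypothetical names; the strategy strings are the values of the constants in `selenium.webdriver.common.by.By`, written out so that the snippet does not need Selenium to be imported.

```python
# Strategy strings equal to the By.* constants (By.ID == "id", and so on).
LOCATORS = {
    "id": "id",
    "name": "name",
    "xpath": "xpath",
    "link_text": "link text",
    "partial_link_text": "partial link text",
    "tag_name": "tag name",
    "class_name": "class name",
    "css_selector": "css selector",
}


def find(driver, kind, value):
    """Return the first element matching `value` using strategy `kind`."""
    return driver.find_element(LOCATORS[kind], value)


def find_all(driver, kind, value):
    """Return a list of all elements matching `value` using strategy `kind`."""
    return driver.find_elements(LOCATORS[kind], value)


# Example (assuming `chrome` is an open driver):
# elem = find(chrome, "class_name", "donate-button")
```

In real code you would normally import `By` and write `driver.find_element(By.CLASS_NAME, "donate-button")` directly.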
    
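#### `by_css_selector`

|selector|meaning|
|:---|---:|
|`by_css_selector('tag')` |plain tag|
|`by_css_selector('#id')` |id, which is unique on the page|
|`by_css_selector('.class')` |search by class|
|`by_css_selector('[attribute]')` |search by attribute|
|Combined searches||
|`by_css_selector('a#id')` |an `<a>` tag with that id, and so on|
|`by_css_selector('body a')` |`body`, a space, then `a`: an `<a>` anywhere inside `<body>`, and so on|
|`by_css_selector('xxx, yyy')` |several selectors in one string: matches `xxx` or `yyy`|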
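
To show how the selector forms in the table combine, here is a small sketch that assembles a selector string from its parts. `css` is a hypothetical helper, not part of Selenium, and it does no escaping of special characters; the resulting string is what you would pass to `find_element_by_css_selector`.

```python
def css(tag="", id_=None, classes=(), attrs=None):
    """Build a compound CSS selector such as 'input#id-search-field.search-field'."""
    selector = tag
    if id_:
        selector += f"#{id_}"
    for name in classes:
        selector += f".{name}"
    for key, value in (attrs or {}).items():
        # A value of None means "the attribute is present", e.g. [placeholder]
        selector += f"[{key}]" if value is None else f'[{key}="{value}"]'
    return selector


search_box = css("input", id_="id-search-field", attrs={"name": "q"})
print(search_box)                        # input#id-search-field[name="q"]
print(" ".join(["body", css("a")]))      # descendant form: body a
print(", ".join([css("a"), css("p")]))   # either-or form: a, p
```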

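### [Element Operations](https://selenium.dev/selenium/docs/api/py/webdriver_remote/selenium.webdriver.remote.webelement.html)
|member|description|
|:---|---:|
|`.send_keys('text', Keys.KEY_NAME, ...)`|keyboard input, accepts several values|
|`.clear()`|clear the content|
|`.click()`|left-click the element|
|`.text`|text inside the element|
|`.location`|position of the element|
|`.size`|size of the element|
|`.rect`|position and size of the element|
|`.id`|Selenium's internal id, not the `id` attribute|
|`.find_element()`|find an element inside this element|
|`.get_attribute('att_name')`|an attribute of the element such as `class` or `id`; `'outerHTML'` gives the full HTML of the element|
|`.get_property('att_name')`|a property of the element such as `id`; gives the same result as above for `'outerHTML'`|
|`.screenshot(filename)`|screenshot of just this element|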



* `Keys` is imported from `selenium.webdriver.common.keys`; a key is written as `Keys.KEY_NAME`

### element.get_attribute('outerHTML')

* Returns the HTML source of the located tag

In [4]:
elem = chrome.find_element_by_class_name('donate-button')
print(elem.text)
# Element screenshot
# elem.screenshot('donate.png')

Donate


In [5]:
"""
<input id="id-search-field" name="q" type="search" role="textbox" 
 class="search-field" placeholder="Search" value="" tabindex="1">
"""

# Locate the element
elem = chrome.find_element_by_tag_name('input')
# Inspect the HTML source of the located element
print(elem.get_attribute('outerHTML'))
print(elem.get_attribute('id'))
print(elem.get_attribute('name'))

<input id="id-search-field" name="q" type="search" role="textbox" class="search-field" placeholder="Search" value="" tabindex="1">
id-search-field
q


In [6]:
elem.get_property('outerHTML')

'<input id="id-search-field" name="q" type="search" role="textbox" class="search-field" placeholder="Search" value="" tabindex="1">'

In [7]:
elem.clear()  # clear the field
elem.send_keys('pool', Keys.ADD)  # type 'pool', then press the numpad + key
# elem.screenshot('input.png')

In [8]:
elem = chrome.find_element_by_id('submit')
elem.click()

### [Scrolling the window](https://selenium-python.readthedocs.io/faq.html#how-to-scroll-down-to-the-bottom-of-a-page)
* `window.scrollTo(0, document.body.scrollHeight);` scrolls to the bottom of the page
* General JavaScript form: `window.scrollTo(x, y)`

In [9]:
chrome.execute_script("window.scrollTo(0, document.body.scrollHeight);")
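
On pages that load more content as you scroll, a single `scrollTo` call may not reach the real bottom. One possible sketch, assuming such a page, repeats the scroll until `document.body.scrollHeight` stops growing. `scroll_to_bottom`, `pause`, and `max_rounds` are hypothetical names; only `execute_script` comes from the driver.

```python
import time


def scroll_to_bottom(driver, pause=1.0, max_rounds=20):
    """Scroll down repeatedly until the page height stops changing.

    Returns the number of scrolls performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for rounds in range(1, max_rounds + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give lazily loaded content time to arrive
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            return rounds
        last_height = new_height
    return max_rounds


# Example (assuming `chrome` is an open driver):
# scroll_to_bottom(chrome, pause=0.5)
```

The `max_rounds` cap keeps the loop from running forever on pages that never stop loading.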

#### The page source can be handed to BeautifulSoup to search for elements

In [10]:
page = chrome.page_source

In [11]:
soup = bs4.BeautifulSoup(page, "html.parser")
print(soup.prettify())

<html class="js no-touch geolocation fontface generatedcontent svg formvalidation placeholder boxsizing no-retina" dir="ltr" lang="en" style="">
 <!--<![endif]-->
 <head>
  <meta charset="utf-8"/>
  <meta content="IE=edge" http-equiv="X-UA-Compatible"/>
  <link href="//ajax.googleapis.com/ajax/libs/jquery/1.8.2/jquery.min.js" rel="prefetch"/>
  <meta content="Python.org" name="application-name"/>
  <meta content="The official home of the Python Programming Language" name="msapplication-tooltip"/>
  <meta content="Python.org" name="apple-mobile-web-app-title"/>
  <meta content="yes" name="apple-mobile-web-app-capable"/>
  <meta content="black" name="apple-mobile-web-app-status-bar-style"/>
  <meta content="width=device-width, initial-scale=1.0" name="viewport"/>
  <meta content="True" name="HandheldFriendly"/>
  <meta content="telephone=no" name="format-detection"/>
  <meta content="on" http-equiv="cleartype"/>
  <meta content="false" http-equiv="imagetoolbar"/>
  <script async="" src="
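
The same idea, searching the saved page source offline instead of through the driver, can also be sketched with the standard library's `html.parser` when BeautifulSoup is not at hand. This swaps in a different parser than the one used above. `TagCollector` is a hypothetical class, and the `page` string is a stand-in for `chrome.page_source` built from the search box markup shown earlier (the surrounding `<form>` is an assumption).

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collect the attributes of every occurrence of one tag name."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.found = []  # one dict of attributes per matching tag

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.found.append(dict(attrs))


# Stand-in for chrome.page_source
page = '''<form><input id="id-search-field" name="q" type="search"
 class="search-field" placeholder="Search" value="" tabindex="1"></form>'''

collector = TagCollector("input")
collector.feed(page)
print(collector.found[0]["id"])    # id-search-field
print(collector.found[0]["name"])  # q
```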

In [12]:
chrome.quit()