# Chapter 8: Extracting Data from the Internet
## 04: Parsing weather data in XML format with ElementTree in Python

2020-12-13, 2019-05-05

### Reference: "Processing XML in Python with ElementTree"

https://pycoders-weekly-chinese.readthedocs.io/en/latest/issue6/processing-xml-in-python-with-element-tree.html

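### Which XML library should you use?
Python has a great many tools for processing XML. In this section I want to take a quick look at the packages Python provides and explain why ElementTree is most likely the one you should use.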

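xml.dom.* modules: an implementation of the W3C DOM API. If you need to work with the DOM API, this package is for you. Note that the xml.dom package contains several modules; be aware of the differences between them.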

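xml.sax.* modules: an implementation of the SAX API. They trade convenience for speed and memory consumption. SAX is an event-based API, which means it can process very large documents "on the fly" without loading them fully into memory (see note 1).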

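xml.parsers.expat: a direct, lower-level interface to the C-based expat parser (see note 2). The expat interface is event (callback) based, somewhat like SAX but not quite, because its interface is not standardized and is specific to the expat library.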

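Finally, **let's look at xml.etree.ElementTree (ET from here on)**. It provides a lightweight, Pythonic API and is backed by a C implementation. Compared with DOM, ET is much faster (see note 3) and has a more pleasant API. Compared with SAX, ET also offers ET.iterparse, which processes a document "on the fly" without loading all of it into memory. ET's performance is on average comparable to SAX, but its API is higher level and more convenient to use. I will demonstrate it shortly.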
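The "on the fly" processing mentioned above can be sketched with `ET.iterparse`. This is a minimal illustration, not a benchmark: `sample_xml` is a small made-up document wrapped in `io.BytesIO`, whereas in practice you would more likely pass a file name or an open file.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data, made up for this illustration.
sample_xml = b"""<data>
    <country name="Liechtenstein"><rank>1</rank></country>
    <country name="Singapore"><rank>4</rank></country>
    <country name="Panama"><rank>68</rank></country>
</data>"""

names = []
# iterparse yields (event, element) pairs while the document is being read.
# With events=("end",) an element is yielded once its closing tag has been seen.
for event, elem in ET.iterparse(io.BytesIO(sample_xml), events=("end",)):
    if elem.tag == "country":
        names.append(elem.get("name"))
        # Drop the element's children, attributes and text now that we are
        # done with it, so memory use stays low on large documents.
        elem.clear()

print(names)
```

Calling `elem.clear()` only after the needed values have been copied out is what keeps the memory footprint small; without it, the whole tree would still be built up in memory.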

In [1]:
import xml.etree.ElementTree as ET

## xml.etree.ElementTree — The ElementTree XML API
https://docs.python.org/3/library/xml.etree.elementtree.html

In [2]:
# Multi-line string input
countryXML='''<?xml version="1.0"?>
<data>
    <country name="Liechtenstein"> <!-- 列支敦士登 -->
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>'''

### A more readable rendering:
```XML
<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
```

In [3]:
# The head of the whole tree is the root node, tagged `data`
countryTree = ET.fromstring(countryXML)

In [4]:
type(countryTree)

xml.etree.ElementTree.Element

![Introduction to XML](http://www.ukoln.ac.uk/metadata/dcmi/dc-elem-prop/image/Slide1.png)

In [5]:
countryTree.tag, countryTree.attrib, countryTree.text

('data', {}, '\n    ')
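The third value above, `'\n    '`, is simply the whitespace between the `<data>` opening tag and its first child. As a small sketch of where ElementTree stores character data, the made-up snippet below shows the difference between `.text` (text before the first child) and `.tail` (text after an element's closing tag).

```python
import xml.etree.ElementTree as ET

# Hypothetical snippet, made up to show where text ends up in the tree.
root = ET.fromstring("<a>hello <b>bold</b> world</a>")
b = root.find("b")

print(repr(root.text))  # text between <a> and its first child
print(repr(b.text))     # text inside <b>
print(repr(b.tail))     # text after </b>, up to the next tag
```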

In [6]:
# Each node can be treated as a list containing its children
for child in countryTree:
    print(child.tag, child.attrib)

country {'name': 'Liechtenstein'}
country {'name': 'Singapore'}
country {'name': 'Panama'}


In [7]:
for child in countryTree:
    print(child.tag, child.attrib, child.text)

country {'name': 'Liechtenstein'}  
        
country {'name': 'Singapore'} 
        
country {'name': 'Panama'} 
        


In [8]:
help(countryTree)

Help on Element object:

class Element(builtins.object)
 |  Methods defined here:
 |  
 |  __copy__(self, /)
 |  
 |  __deepcopy__(self, memo, /)
 |  
 |  __delitem__(self, key, /)
 |      Delete self[key].
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __getitem__(self, key, /)
 |      Return self[key].
 |  
 |  __getstate__(self, /)
 |  
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __len__(self, /)
 |      Return len(self).
 |  
 |  __repr__(self, /)
 |      Return repr(self).
 |  
 |  __setitem__(self, key, value, /)
 |      Set self[key] to value.
 |  
 |  __setstate__(self, state, /)
 |  
 |  __sizeof__(self, /)
 |      Size of object in memory, in bytes.
 |  
 |  append(self, subelement, /)
 |  
 |  clear(self, /)
 |  
 |  extend(self, elements, /)
 |  
 |  find(self, /, path, namespaces=None)
 |  
 |  findall(self, /, path, namespaces=None)
 |  
 |  findtext(self, /, 

In [9]:
# All elements with the "neighbor" tag inside this element
for neighbor in countryTree.iter('neighbor'):
    print(neighbor.tag, neighbor.attrib)

neighbor {'name': 'Austria', 'direction': 'E'}
neighbor {'name': 'Switzerland', 'direction': 'W'}
neighbor {'name': 'Malaysia', 'direction': 'N'}
neighbor {'name': 'Costa Rica', 'direction': 'W'}
neighbor {'name': 'Colombia', 'direction': 'E'}


In [13]:
# The attrib of every "neighbor" element inside this element is a dict
for neighbor in countryTree.iter('neighbor'):
    print(type(neighbor.attrib))

<class 'dict'>
<class 'dict'>
<class 'dict'>
<class 'dict'>
<class 'dict'>


In [10]:
# Directly get the value of the "name" attribute
for neighbor in countryTree.iter('neighbor'):
    print(neighbor.get("name"))

Austria
Switzerland
Malaysia
Costa Rica
Colombia


In [14]:
countryTree

<Element 'data' at 0x0000021EE016D098>

In [15]:
# Print the element
ET.dump(countryTree)

<data>
    <country name="Liechtenstein"> 
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor direction="E" name="Austria" />
        <neighbor direction="W" name="Switzerland" />
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor direction="N" name="Malaysia" />
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor direction="W" name="Costa Rica" />
        <neighbor direction="E" name="Colombia" />
    </country>
</data>


In [16]:
# Find the *first* element tagged country; the result is a single element, not a list
countryTree.find("country")  

<Element 'country' at 0x0000021EE016D048>

In [17]:
ET.dump(countryTree.find("country") )

<country name="Liechtenstein"> 
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor direction="E" name="Austria" />
        <neighbor direction="W" name="Switzerland" />
    </country>
    


In [18]:
# Find all direct children of this element that are tagged 'country'
# The result is a list
countryTree.findall('country')

[<Element 'country' at 0x0000021EE016D048>,
 <Element 'country' at 0x0000021EE016D548>,
 <Element 'country' at 0x0000021EE016D6D8>]

In [19]:
type(countryTree.findall('country'))

list

In [20]:
# The text of the first element tagged country
countryTree.find("country").text  

' \n        '

In [21]:
# The attributes of the first element tagged country
countryTree.find("country").attrib

{'name': 'Liechtenstein'}

In [22]:
# It is simply a Python dictionary
type(countryTree.find("country").attrib)

dict

In [23]:
# Get the value of the "name" attribute of the first element tagged country
countryTree.find('country').get("name")  

'Liechtenstein'

In [24]:
for country in countryTree.findall('country'):
    rank = country.find('rank').text
    name = country.get('name')
    print(name, rank)

Liechtenstein 1
Singapore 4
Panama 68
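The loop above can be extended into a small extraction step. The sketch below is one possible way to do it, not the only one: it uses `findtext()` to read the text of a child and converts the strings to integers. The sample is a trimmed copy of the country data, and the `records` structure is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the country data used above.
sample_xml = """<data>
    <country name="Liechtenstein">
        <rank>1</rank><year>2008</year><gdppc>141100</gdppc>
    </country>
    <country name="Singapore">
        <rank>4</rank><year>2011</year><gdppc>59900</gdppc>
    </country>
</data>"""

root = ET.fromstring(sample_xml)

records = []
for country in root.findall("country"):
    records.append({
        "name": country.get("name"),
        # findtext() returns the text of the first matching child element.
        # Element text is always a string, so convert it explicitly.
        "rank": int(country.findtext("rank")),
        "year": int(country.findtext("year")),
        "gdppc": int(country.findtext("gdppc")),
    })

for record in records:
    print(record)
```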


## Let's analyze an XML file fetched from OpenWeatherMap

In [25]:
import xml.etree.ElementTree as ET
import urllib.request  # urlopen() lives in the urllib.request submodule
url = 'http://api.openweathermap.org/data/2.5/forecast?q=Taipei,TW&mode=xml&appid=d1deefb25fb63cf70eea21a43dad94f7'
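Rather than hard-coding the whole query string, the URL could also be assembled from its parts. The sketch below is only an illustration: `build_forecast_url` is a hypothetical helper, and `"YOUR_API_KEY"` is a placeholder for your own key.

```python
from urllib.parse import urlencode

BASE_URL = "http://api.openweathermap.org/data/2.5/forecast"


def build_forecast_url(city, api_key, mode="xml"):
    """Return the forecast URL for a city (hypothetical helper)."""
    # safe="," keeps the comma in "Taipei,TW" unescaped, as in the URL above.
    query = urlencode({"q": city, "mode": mode, "appid": api_key}, safe=",")
    return f"{BASE_URL}?{query}"


# "YOUR_API_KEY" is a placeholder, not a working key.
example_url = build_forecast_url("Taipei,TW", "YOUR_API_KEY")
print(example_url)
```

Keeping the key in a variable (or reading it from an environment variable) makes it easier to avoid publishing it together with the notebook.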

### XML viewer
https://codebeautify.org/xmlviewer

http://api.openweathermap.org/data/2.5/forecast?q=Taipei,TW&mode=xml&appid=d1deefb25fb63cf70eea21a43dad94f7
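As a local alternative to an online viewer, the standard library can indent XML as well. The sketch below assumes a tiny made-up element and a hypothetical `prettify` helper built on `xml.dom.minidom`; on Python 3.9 and later, `ET.indent()` is another option.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical sample, shaped like one forecast entry but heavily trimmed.
elem = ET.fromstring(
    '<forecast><time from="a" to="b">'
    '<humidity unit="%" value="80" /></time></forecast>'
)


def prettify(element):
    """Return an indented string for an Element (hypothetical helper)."""
    raw = ET.tostring(element, encoding="unicode")
    return minidom.parseString(raw).toprettyxml(indent="  ")


pretty = prettify(elem)
print(pretty)
```

Unlike `ET.dump()`, which writes to standard output and returns nothing, `ET.tostring()` returns the serialized XML, so it can be passed on to other tools.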

In [26]:
response = urllib.request.urlopen(url)
xml_response = response.read()
# xml_response

In [27]:
type(xml_response)

bytes

In [28]:
weatherXML = ET.fromstring(xml_response)

In [29]:
type(weatherXML)

xml.etree.ElementTree.Element

In [30]:
ET.dump(weatherXML)

<weatherdata><location><name>Taipei</name><type /><country>TW</country><timezone>28800</timezone><location altitude="0" geobase="geonames" geobaseid="1668341" latitude="25.0478" longitude="121.5319" /></location><credit /><meta><lastupdate /><calctime>0</calctime><nextupdate /></meta><sun rise="2020-12-13T22:30:49" set="2020-12-14T09:06:26" /><forecast><time from="2020-12-14T00:00:00" to="2020-12-14T03:00:00"><symbol name="light rain" number="500" var="10d" /><precipitation probability="0.72" type="rain" unit="3h" value="0.64" /><windDirection code="NE" deg="54" name="NorthEast" /><windSpeed mps="5.46" name="Gentle Breeze" unit="m/s" /><temperature max="290.33" min="289.88" unit="kelvin" value="290.33" /><feels_like unit="kelvin" value="287.67" /><pressure unit="hPa" value="1021" /><humidity unit="%" value="80" /><clouds all="87" unit="%" value="overcast clouds" /><visibility value="10000" /></time><time from="2020-12-14T03:00:00" to="2020-12-14T06:00:00"><symbol name="light rain" numb

In [31]:
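# Not actually any prettier → use an XML viewer or BeautifulSoup to inspect it instead.
# Note: ET.dump() writes to stdout and returns None, so pprint only receives None.
import pprint
pprint.pprint(ET.dump(weatherXML))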

<weatherdata><location><name>Taipei</name><type /><country>TW</country><timezone>28800</timezone><location altitude="0" geobase="geonames" geobaseid="1668341" latitude="25.0478" longitude="121.5319" /></location><credit /><meta><lastupdate /><calctime>0</calctime><nextupdate /></meta><sun rise="2020-12-13T22:30:49" set="2020-12-14T09:06:26" /><forecast><time from="2020-12-14T00:00:00" to="2020-12-14T03:00:00"><symbol name="light rain" number="500" var="10d" /><precipitation probability="0.72" type="rain" unit="3h" value="0.64" /><windDirection code="NE" deg="54" name="NorthEast" /><windSpeed mps="5.46" name="Gentle Breeze" unit="m/s" /><temperature max="290.33" min="289.88" unit="kelvin" value="290.33" /><feels_like unit="kelvin" value="287.67" /><pressure unit="hPa" value="1021" /><humidity unit="%" value="80" /><clouds all="87" unit="%" value="overcast clouds" /><visibility value="10000" /></time><time from="2020-12-14T03:00:00" to="2020-12-14T06:00:00"><symbol name="light rain" numb

In [32]:
print(weatherXML)

<Element 'weatherdata' at 0x0000021EE01EE138>


## xml.etree.ElementTree — The ElementTree XML API
https://docs.python.org/3/library/xml.etree.elementtree.html

In [33]:
# len() on an element gives the number of its direct children
len(weatherXML)

5

In [34]:
for child in weatherXML:
    print(child.tag, child.attrib)

location {}
credit {}
meta {}
sun {'rise': '2020-12-13T22:30:49', 'set': '2020-12-14T09:06:26'}
forecast {}


In [35]:
help(weatherXML.iter())

Help on _element_iterator object:

class _element_iterator(builtins.object)
 |  Methods defined here:
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __next__(self, /)
 |      Implement next(self).



### Python Iterators
An iterator is an object that contains a countable number of values.  \
An iterator is an object that can be iterated upon, meaning that you can traverse through all the values. \
Technically, in Python, an iterator is an object that implements the iterator protocol, which consists of the methods `__iter__()` and `__next__()`.
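To make the protocol concrete, here is a minimal sketch that drives the iterator returned by `Element.iter()` by hand. The tiny tree is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical tree: <a> has children <b> and <c>, and <c> has a child <d>.
root = ET.fromstring("<a><b/><c><d/></c></a>")

it = root.iter()               # Element.iter() returns an iterator
same_object = iter(it) is it   # an iterator returns itself from __iter__()
first = next(it).tag           # the root itself comes first
second = next(it).tag

# A for loop (or a comprehension) keeps calling next() until
# StopIteration is raised, so it picks up where we left off.
remaining = [elem.tag for elem in it]

print(same_object, first, second, remaining)
```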

In [36]:
# Walk the tags of the entire weatherXML tree in depth-first order
for elem in weatherXML.iter():
    print(elem.tag, elem.attrib)

weatherdata {}
location {}
name {}
type {}
country {}
timezone {}
location {'altitude': '0', 'latitude': '25.0478', 'longitude': '121.5319', 'geobase': 'geonames', 'geobaseid': '1668341'}
credit {}
meta {}
lastupdate {}
calctime {}
nextupdate {}
sun {'rise': '2020-12-13T22:30:49', 'set': '2020-12-14T09:06:26'}
forecast {}
time {'from': '2020-12-14T00:00:00', 'to': '2020-12-14T03:00:00'}
symbol {'number': '500', 'name': 'light rain', 'var': '10d'}
precipitation {'probability': '0.72', 'unit': '3h', 'value': '0.64', 'type': 'rain'}
windDirection {'deg': '54', 'code': 'NE', 'name': 'NorthEast'}
windSpeed {'mps': '5.46', 'unit': 'm/s', 'name': 'Gentle Breeze'}
temperature {'unit': 'kelvin', 'value': '290.33', 'min': '289.88', 'max': '290.33'}
feels_like {'value': '287.67', 'unit': 'kelvin'}
pressure {'unit': 'hPa', 'value': '1021'}
humidity {'value': '80', 'unit': '%'}
clouds {'value': 'overcast clouds', 'all': '87', 'unit': '%'}
visibility {'value': '10000'}
time {'from': '2020-12-14T03:0

In [37]:
# We can restrict the iteration to elements with tag = "temperature"
for elem in weatherXML.iter("temperature"):
    print(elem.tag, elem.attrib)

temperature {'unit': 'kelvin', 'value': '290.33', 'min': '289.88', 'max': '290.33'}
temperature {'unit': 'kelvin', 'value': '289.57', 'min': '289.24', 'max': '289.57'}
temperature {'unit': 'kelvin', 'value': '289.17', 'min': '289.06', 'max': '289.17'}
temperature {'unit': 'kelvin', 'value': '288.46', 'min': '288.44', 'max': '288.46'}
temperature {'unit': 'kelvin', 'value': '288.83', 'min': '288.83', 'max': '288.83'}
temperature {'unit': 'kelvin', 'value': '288.79', 'min': '288.79', 'max': '288.79'}
temperature {'unit': 'kelvin', 'value': '288.74', 'min': '288.74', 'max': '288.74'}
temperature {'unit': 'kelvin', 'value': '288.89', 'min': '288.89', 'max': '288.89'}
temperature {'unit': 'kelvin', 'value': '289.45', 'min': '289.45', 'max': '289.45'}
temperature {'unit': 'kelvin', 'value': '289.35', 'min': '289.35', 'max': '289.35'}
temperature {'unit': 'kelvin', 'value': '289.39', 'min': '289.39', 'max': '289.39'}
temperature {'unit': 'kelvin', 'value': '289.5', 'min': '289.5', 'max': '289

In [38]:
# We can restrict the iteration to elements with tag = "temperature"
# enumerate() adds the position of each element
for index, elem in enumerate(weatherXML.iter("temperature"), start = 1):
    print(index, elem.tag, elem.attrib)

1 temperature {'unit': 'kelvin', 'value': '290.33', 'min': '289.88', 'max': '290.33'}
2 temperature {'unit': 'kelvin', 'value': '289.57', 'min': '289.24', 'max': '289.57'}
3 temperature {'unit': 'kelvin', 'value': '289.17', 'min': '289.06', 'max': '289.17'}
4 temperature {'unit': 'kelvin', 'value': '288.46', 'min': '288.44', 'max': '288.46'}
5 temperature {'unit': 'kelvin', 'value': '288.83', 'min': '288.83', 'max': '288.83'}
6 temperature {'unit': 'kelvin', 'value': '288.79', 'min': '288.79', 'max': '288.79'}
7 temperature {'unit': 'kelvin', 'value': '288.74', 'min': '288.74', 'max': '288.74'}
8 temperature {'unit': 'kelvin', 'value': '288.89', 'min': '288.89', 'max': '288.89'}
9 temperature {'unit': 'kelvin', 'value': '289.45', 'min': '289.45', 'max': '289.45'}
10 temperature {'unit': 'kelvin', 'value': '289.35', 'min': '289.35', 'max': '289.35'}
11 temperature {'unit': 'kelvin', 'value': '289.39', 'min': '289.39', 'max': '289.39'}
12 temperature {'unit': 'kelvin', 'value': '289.5', 

In [39]:
# We can restrict the iteration to elements with tag = "temperature"
# and directly take the "value" attribute we want
for index, elem in enumerate(weatherXML.iter("temperature")):
    print(index, elem.get("value"))

0 290.33
1 289.57
2 289.17
3 288.46
4 288.83
5 288.79
6 288.74
7 288.89
8 289.45
9 289.35
10 289.39
11 289.5
12 289.49
13 289.28
14 289.06
15 288.96
16 289.26
17 289.56
18 289.16
19 289.05
20 288.94
21 288.85
22 288.96
23 289.16
24 289.34
25 289.28
26 289.46
27 289.86
28 290.13
29 290.38
30 290.69
31 290.94
32 291.76
33 292.77
34 291.82
35 291.79
36 291.68
37 291.73
38 291.51
39 291.67
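The values above are strings in kelvin. A minimal sketch of converting them to degrees Celsius follows; the sample is cut down to the first two entries of the forecast, and `kelvin_to_celsius` is a helper introduced here for illustration.

```python
import xml.etree.ElementTree as ET

# First two entries of the forecast above, trimmed to the temperature element.
sample_xml = """<forecast>
    <time from="2020-12-14T00:00:00" to="2020-12-14T03:00:00">
        <temperature unit="kelvin" value="290.33" min="289.88" max="290.33" />
    </time>
    <time from="2020-12-14T03:00:00" to="2020-12-14T06:00:00">
        <temperature unit="kelvin" value="289.57" min="289.24" max="289.57" />
    </time>
</forecast>"""


def kelvin_to_celsius(kelvin):
    """Convert a temperature from kelvin to degrees Celsius."""
    return kelvin - 273.15


forecast = ET.fromstring(sample_xml)
celsius = [
    # Attribute values are strings, so convert to float before doing arithmetic.
    round(kelvin_to_celsius(float(t.get("value"))), 2)
    for t in forecast.iter("temperature")
]
print(celsius)
```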


In [40]:
# The actual data, 40 records covering 5 days, is stored in 'forecast'
weatherXML[4]

<Element 'forecast' at 0x0000021EE01EE548>

In [41]:
for elem in weatherXML[4]:
    print(elem.tag, elem.attrib)

time {'from': '2020-12-14T00:00:00', 'to': '2020-12-14T03:00:00'}
time {'from': '2020-12-14T03:00:00', 'to': '2020-12-14T06:00:00'}
time {'from': '2020-12-14T06:00:00', 'to': '2020-12-14T09:00:00'}
time {'from': '2020-12-14T09:00:00', 'to': '2020-12-14T12:00:00'}
time {'from': '2020-12-14T12:00:00', 'to': '2020-12-14T15:00:00'}
time {'from': '2020-12-14T15:00:00', 'to': '2020-12-14T18:00:00'}
time {'from': '2020-12-14T18:00:00', 'to': '2020-12-14T21:00:00'}
time {'from': '2020-12-14T21:00:00', 'to': '2020-12-15T00:00:00'}
time {'from': '2020-12-15T00:00:00', 'to': '2020-12-15T03:00:00'}
time {'from': '2020-12-15T03:00:00', 'to': '2020-12-15T06:00:00'}
time {'from': '2020-12-15T06:00:00', 'to': '2020-12-15T09:00:00'}
time {'from': '2020-12-15T09:00:00', 'to': '2020-12-15T12:00:00'}
time {'from': '2020-12-15T12:00:00', 'to': '2020-12-15T15:00:00'}
time {'from': '2020-12-15T15:00:00', 'to': '2020-12-15T18:00:00'}
time {'from': '2020-12-15T18:00:00', 'to': '2020-12-15T21:00:00'}
time {'fro

In [42]:
# This gives the same result as taking the "time" elements from the whole tree
for elem in weatherXML.iter("time"):
    print(elem.tag, elem.attrib)

time {'from': '2020-12-14T00:00:00', 'to': '2020-12-14T03:00:00'}
time {'from': '2020-12-14T03:00:00', 'to': '2020-12-14T06:00:00'}
time {'from': '2020-12-14T06:00:00', 'to': '2020-12-14T09:00:00'}
time {'from': '2020-12-14T09:00:00', 'to': '2020-12-14T12:00:00'}
time {'from': '2020-12-14T12:00:00', 'to': '2020-12-14T15:00:00'}
time {'from': '2020-12-14T15:00:00', 'to': '2020-12-14T18:00:00'}
time {'from': '2020-12-14T18:00:00', 'to': '2020-12-14T21:00:00'}
time {'from': '2020-12-14T21:00:00', 'to': '2020-12-15T00:00:00'}
time {'from': '2020-12-15T00:00:00', 'to': '2020-12-15T03:00:00'}
time {'from': '2020-12-15T03:00:00', 'to': '2020-12-15T06:00:00'}
time {'from': '2020-12-15T06:00:00', 'to': '2020-12-15T09:00:00'}
time {'from': '2020-12-15T09:00:00', 'to': '2020-12-15T12:00:00'}
time {'from': '2020-12-15T12:00:00', 'to': '2020-12-15T15:00:00'}
time {'from': '2020-12-15T15:00:00', 'to': '2020-12-15T18:00:00'}
time {'from': '2020-12-15T18:00:00', 'to': '2020-12-15T21:00:00'}
time {'fro

In [43]:
weatherXML[4].tag, weatherXML[4].attrib

('forecast', {})

In [44]:
# This is record [0] of the 40 time records
ET.dump(weatherXML[4][0])

<time from="2020-12-14T00:00:00" to="2020-12-14T03:00:00"><symbol name="light rain" number="500" var="10d" /><precipitation probability="0.72" type="rain" unit="3h" value="0.64" /><windDirection code="NE" deg="54" name="NorthEast" /><windSpeed mps="5.46" name="Gentle Breeze" unit="m/s" /><temperature max="290.33" min="289.88" unit="kelvin" value="290.33" /><feels_like unit="kelvin" value="287.67" /><pressure unit="hPa" value="1021" /><humidity unit="%" value="80" /><clouds all="87" unit="%" value="overcast clouds" /><visibility value="10000" /></time>


In [45]:
weatherXML[4][0]

<Element 'time' at 0x0000021EE01EE598>

In [46]:
help(weatherXML[4][0])

Help on Element object:

class Element(builtins.object)
 |  Methods defined here:
 |  
 |  __copy__(self, /)
 |  
 |  __deepcopy__(self, memo, /)
 |  
 |  __delitem__(self, key, /)
 |      Delete self[key].
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __getitem__(self, key, /)
 |      Return self[key].
 |  
 |  __getstate__(self, /)
 |  
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __len__(self, /)
 |      Return len(self).
 |  
 |  __repr__(self, /)
 |      Return repr(self).
 |  
 |  __setitem__(self, key, value, /)
 |      Set self[key] to value.
 |  
 |  __setstate__(self, state, /)
 |  
 |  __sizeof__(self, /)
 |      Size of object in memory, in bytes.
 |  
 |  append(self, subelement, /)
 |  
 |  clear(self, /)
 |  
 |  extend(self, elements, /)
 |  
 |  find(self, /, path, namespaces=None)
 |  
 |  findall(self, /, path, namespaces=None)
 |  
 |  findtext(self, /, 

In [47]:
# There is also a path concept similar to directory paths, XPath, for selecting elements
for elem in weatherXML[4].iterfind('time/temperature'):
    print(elem.tag, elem.attrib["value"], elem.text)

temperature 290.33 None
temperature 289.57 None
temperature 289.17 None
temperature 288.46 None
temperature 288.83 None
temperature 288.79 None
temperature 288.74 None
temperature 288.89 None
temperature 289.45 None
temperature 289.35 None
temperature 289.39 None
temperature 289.5 None
temperature 289.49 None
temperature 289.28 None
temperature 289.06 None
temperature 288.96 None
temperature 289.26 None
temperature 289.56 None
temperature 289.16 None
temperature 289.05 None
temperature 288.94 None
temperature 288.85 None
temperature 288.96 None
temperature 289.16 None
temperature 289.34 None
temperature 289.28 None
temperature 289.46 None
temperature 289.86 None
temperature 290.13 None
temperature 290.38 None
temperature 290.69 None
temperature 290.94 None
temperature 291.76 None
temperature 292.77 None
temperature 291.82 None
temperature 291.79 None
temperature 291.68 None
temperature 291.73 None
temperature 291.51 None
temperature 291.67 None
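ElementTree supports only a limited subset of XPath, but that subset includes predicates. The sketch below shows three of them on a small sample adapted from the forecast above (the second symbol was changed to "moderate rain" for illustration, and the elements are trimmed).

```python
import xml.etree.ElementTree as ET

# Sample adapted from the forecast above; values changed for illustration.
sample_xml = """<forecast>
    <time from="2020-12-14T00:00:00" to="2020-12-14T03:00:00">
        <symbol number="500" name="light rain" var="10d" />
        <temperature unit="kelvin" value="290.33" />
    </time>
    <time from="2020-12-14T03:00:00" to="2020-12-14T06:00:00">
        <symbol number="501" name="moderate rain" var="10d" />
        <temperature unit="kelvin" value="289.57" />
    </time>
</forecast>"""
forecast = ET.fromstring(sample_xml)

# Attribute predicate: the temperature of the slot starting at a given time.
temp = forecast.find("time[@from='2020-12-14T03:00:00']/temperature")
print(temp.get("value"))

# Positional predicate: positions are 1-based, so time[1] is the first slot.
first = forecast.find("time[1]")
print(first.get("from"))

# '..' steps back to the parent: every slot whose symbol is "moderate rain".
rainy = forecast.findall("time/symbol[@name='moderate rain']/..")
rainy_starts = [t.get("from") for t in rainy]
print(rainy_starts)
```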


In [48]:
for elem in weatherXML[4]:
    print(elem.tag, elem.attrib, elem.text)

time {'from': '2020-12-14T00:00:00', 'to': '2020-12-14T03:00:00'} None
time {'from': '2020-12-14T03:00:00', 'to': '2020-12-14T06:00:00'} None
time {'from': '2020-12-14T06:00:00', 'to': '2020-12-14T09:00:00'} None
time {'from': '2020-12-14T09:00:00', 'to': '2020-12-14T12:00:00'} None
time {'from': '2020-12-14T12:00:00', 'to': '2020-12-14T15:00:00'} None
time {'from': '2020-12-14T15:00:00', 'to': '2020-12-14T18:00:00'} None
time {'from': '2020-12-14T18:00:00', 'to': '2020-12-14T21:00:00'} None
time {'from': '2020-12-14T21:00:00', 'to': '2020-12-15T00:00:00'} None
time {'from': '2020-12-15T00:00:00', 'to': '2020-12-15T03:00:00'} None
time {'from': '2020-12-15T03:00:00', 'to': '2020-12-15T06:00:00'} None
time {'from': '2020-12-15T06:00:00', 'to': '2020-12-15T09:00:00'} None
time {'from': '2020-12-15T09:00:00', 'to': '2020-12-15T12:00:00'} None
time {'from': '2020-12-15T12:00:00', 'to': '2020-12-15T15:00:00'} None
time {'from': '2020-12-15T15:00:00', 'to': '2020-12-15T18:00:00'} None
time {

In [49]:
# The forecast content of each 3-hour period:
for elem in weatherXML[4][0]:
    print(elem.tag, elem.attrib, elem.text)

symbol {'number': '500', 'name': 'light rain', 'var': '10d'} None
precipitation {'probability': '0.72', 'unit': '3h', 'value': '0.64', 'type': 'rain'} None
windDirection {'deg': '54', 'code': 'NE', 'name': 'NorthEast'} None
windSpeed {'mps': '5.46', 'unit': 'm/s', 'name': 'Gentle Breeze'} None
temperature {'unit': 'kelvin', 'value': '290.33', 'min': '289.88', 'max': '290.33'} None
feels_like {'value': '287.67', 'unit': 'kelvin'} None
pressure {'unit': 'hPa', 'value': '1021'} None
humidity {'value': '80', 'unit': '%'} None
clouds {'value': 'overcast clouds', 'all': '87', 'unit': '%'} None
visibility {'value': '10000'} None


In [50]:
for elem in weatherXML[4][1]:
    print(elem.tag, elem.attrib, elem.text)

symbol {'number': '500', 'name': 'light rain', 'var': '10d'} None
precipitation {'probability': '0.76', 'unit': '3h', 'value': '1.58', 'type': 'rain'} None
windDirection {'deg': '46', 'code': 'NE', 'name': 'NorthEast'} None
windSpeed {'mps': '5.64', 'unit': 'm/s', 'name': 'Moderate breeze'} None
temperature {'unit': 'kelvin', 'value': '289.57', 'min': '289.24', 'max': '289.57'} None
feels_like {'value': '286.54', 'unit': 'kelvin'} None
pressure {'unit': 'hPa', 'value': '1020'} None
humidity {'value': '80', 'unit': '%'} None
clouds {'value': 'overcast clouds', 'all': '95', 'unit': '%'} None
visibility {'value': '10000'} None
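Since every `time` element has the same children, each 3-hour slot can be flattened into one record. The following is a sketch under that assumption: `time_to_record` is a hypothetical helper, the field names are chosen here for illustration, and the sample holds trimmed copies of the two entries shown above.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Trimmed copies of the first two forecast entries shown above.
sample_xml = """<forecast>
    <time from="2020-12-14T00:00:00" to="2020-12-14T03:00:00">
        <symbol number="500" name="light rain" var="10d" />
        <temperature unit="kelvin" value="290.33" min="289.88" max="290.33" />
        <humidity unit="%" value="80" />
    </time>
    <time from="2020-12-14T03:00:00" to="2020-12-14T06:00:00">
        <symbol number="500" name="light rain" var="10d" />
        <temperature unit="kelvin" value="289.57" min="289.24" max="289.57" />
        <humidity unit="%" value="80" />
    </time>
</forecast>"""


def time_to_record(time_elem):
    """Flatten one <time> element into a dict (hypothetical helper)."""
    return {
        "from": datetime.fromisoformat(time_elem.get("from")),
        "to": datetime.fromisoformat(time_elem.get("to")),
        "weather": time_elem.find("symbol").get("name"),
        "temp_kelvin": float(time_elem.find("temperature").get("value")),
        "humidity_pct": int(time_elem.find("humidity").get("value")),
    }


forecast = ET.fromstring(sample_xml)
records = [time_to_record(t) for t in forecast.findall("time")]
for record in records:
    print(record)
```

A list of dicts like this can be handed directly to tools such as `csv.DictWriter` or a DataFrame constructor if further analysis is needed.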


In [51]:
# weatherXML[4]
result = weatherXML.find("forecast")

In [52]:
len(result)

40

In [54]:
for item in weatherXML.find("forecast"):
    print(item.tag, item.attrib)

time {'from': '2020-12-14T00:00:00', 'to': '2020-12-14T03:00:00'}
time {'from': '2020-12-14T03:00:00', 'to': '2020-12-14T06:00:00'}
time {'from': '2020-12-14T06:00:00', 'to': '2020-12-14T09:00:00'}
time {'from': '2020-12-14T09:00:00', 'to': '2020-12-14T12:00:00'}
time {'from': '2020-12-14T12:00:00', 'to': '2020-12-14T15:00:00'}
time {'from': '2020-12-14T15:00:00', 'to': '2020-12-14T18:00:00'}
time {'from': '2020-12-14T18:00:00', 'to': '2020-12-14T21:00:00'}
time {'from': '2020-12-14T21:00:00', 'to': '2020-12-15T00:00:00'}
time {'from': '2020-12-15T00:00:00', 'to': '2020-12-15T03:00:00'}
time {'from': '2020-12-15T03:00:00', 'to': '2020-12-15T06:00:00'}
time {'from': '2020-12-15T06:00:00', 'to': '2020-12-15T09:00:00'}
time {'from': '2020-12-15T09:00:00', 'to': '2020-12-15T12:00:00'}
time {'from': '2020-12-15T12:00:00', 'to': '2020-12-15T15:00:00'}
time {'from': '2020-12-15T15:00:00', 'to': '2020-12-15T18:00:00'}
time {'from': '2020-12-15T18:00:00', 'to': '2020-12-15T21:00:00'}
time {'fro

In [55]:
weatherXML[4][1].find("symbol")

<Element 'symbol' at 0x0000021EE01EE958>

In [56]:
weatherXML.find("forecast")[1].find("symbol")

<Element 'symbol' at 0x0000021EE01EE958>

In [57]:
weatherXML.find("forecast")[1].find("symbol").get("name")

'light rain'

In [58]:
# findall() returns a list
weatherXML.findall("forecast")

[<Element 'forecast' at 0x0000021EE01EE548>]

In [59]:
for elem in weatherXML.iter("symbol"):
    print(elem.tag, elem.attrib)

symbol {'number': '500', 'name': 'light rain', 'var': '10d'}
symbol {'number': '500', 'name': 'light rain', 'var': '10d'}
symbol {'number': '500', 'name': 'light rain', 'var': '10d'}
symbol {'number': '500', 'name': 'light rain', 'var': '10n'}
symbol {'number': '500', 'name': 'light rain', 'var': '10n'}
symbol {'number': '500', 'name': 'light rain', 'var': '10n'}
symbol {'number': '500', 'name': 'light rain', 'var': '10n'}
symbol {'number': '500', 'name': 'light rain', 'var': '10d'}
symbol {'number': '500', 'name': 'light rain', 'var': '10d'}
symbol {'number': '500', 'name': 'light rain', 'var': '10d'}
symbol {'number': '500', 'name': 'light rain', 'var': '10d'}
symbol {'number': '500', 'name': 'light rain', 'var': '10n'}
symbol {'number': '500', 'name': 'light rain', 'var': '10n'}
symbol {'number': '500', 'name': 'light rain', 'var': '10n'}
symbol {'number': '501', 'name': 'moderate rain', 'var': '10n'}
symbol {'number': '500', 'name': 'light rain', 'var': '10d'}
symbol {'number': '50

In [63]:
for elem in weatherXML.iter("symbol"):
    print(elem.tag, elem.get('name'))

symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol moderate rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol light rain
symbol moderate rain
symbol light rain
symbol light rain
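A long list like this is easier to read when summarized. One possible approach, sketched here on a made-up three-entry sample, is to count the weather names with `collections.Counter`.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample with three slots, shaped like the forecast above.
sample_xml = """<forecast>
    <time><symbol number="500" name="light rain" var="10d" /></time>
    <time><symbol number="500" name="light rain" var="10n" /></time>
    <time><symbol number="501" name="moderate rain" var="10n" /></time>
</forecast>"""

forecast = ET.fromstring(sample_xml)

# Count how often each weather description occurs.
counts = Counter(symbol.get("name") for symbol in forecast.iter("symbol"))
print(counts.most_common())
```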


In [64]:
for elem in weatherXML[4].iter("symbol"):
    print(elem.tag, elem.attrib, elem.text)

symbol {'number': '500', 'name': 'light rain', 'var': '10d'} None
symbol {'number': '500', 'name': 'light rain', 'var': '10d'} None
symbol {'number': '500', 'name': 'light rain', 'var': '10d'} None
symbol {'number': '500', 'name': 'light rain', 'var': '10n'} None
symbol {'number': '500', 'name': 'light rain', 'var': '10n'} None
symbol {'number': '500', 'name': 'light rain', 'var': '10n'} None
symbol {'number': '500', 'name': 'light rain', 'var': '10n'} None
symbol {'number': '500', 'name': 'light rain', 'var': '10d'} None
symbol {'number': '500', 'name': 'light rain', 'var': '10d'} None
symbol {'number': '500', 'name': 'light rain', 'var': '10d'} None
symbol {'number': '500', 'name': 'light rain', 'var': '10d'} None
symbol {'number': '500', 'name': 'light rain', 'var': '10n'} None
symbol {'number': '500', 'name': 'light rain', 'var': '10n'} None
symbol {'number': '500', 'name': 'light rain', 'var': '10n'} None
symbol {'number': '501', 'name': 'moderate rain', 'var': '10n'} None
symbol 

In [65]:
weatherXML[4][1].find("symbol")

<Element 'symbol' at 0x0000021EE01EE958>

In [66]:
weatherXML[4][1].find("symbol").text

In [67]:
weatherXML[4][1].find("symbol").get("name")

'light rain'
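The `.text` lookup on `symbol` shows no output because the element is empty, so its text is `None`. `find()` likewise returns `None` when no child matches, which is worth guarding against. The sketch below uses a hypothetical helper, `attr_or_default`, and a made-up sample element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a slot with a symbol but no <snow> child.
time_elem = ET.fromstring('<time><symbol name="light rain" /></time>')

symbol_text = time_elem.find("symbol").text  # empty element, so no text
missing = time_elem.find("snow")             # no such child


def attr_or_default(parent, tag, attr, default=None):
    """Return an attribute of the first matching child, or a default."""
    child = parent.find(tag)
    # Compare with None explicitly: an element without children is falsy,
    # so "if child:" would wrongly treat <symbol .../> as missing.
    if child is None:
        return default
    return child.get(attr, default)


print(symbol_text, missing)
print(attr_or_default(time_elem, "symbol", "name"))
print(attr_or_default(time_elem, "snow", "value", default="0"))
```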

In [68]:
weatherXML[4][1].tag

'time'

In [69]:
weatherXML[4].findall("time")

[<Element 'time' at 0x0000021EE01EE598>,
 <Element 'time' at 0x0000021EE01EE908>,
 <Element 'time' at 0x0000021EE01EEC78>,
 <Element 'time' at 0x0000021EE01F9048>,
 <Element 'time' at 0x0000021EE01F93B8>,
 <Element 'time' at 0x0000021EE01F9728>,
 <Element 'time' at 0x0000021EE01F9A98>,
 <Element 'time' at 0x0000021EE01F9E08>,
 <Element 'time' at 0x0000021EE01FE1D8>,
 <Element 'time' at 0x0000021EE01FE548>,
 <Element 'time' at 0x0000021EE01FE8B8>,
 <Element 'time' at 0x0000021EE01FEC28>,
 <Element 'time' at 0x0000021EE01FEF98>,
 <Element 'time' at 0x0000021EE0204368>,
 <Element 'time' at 0x0000021EE02046D8>,
 <Element 'time' at 0x0000021EE0204A48>,
 <Element 'time' at 0x0000021EE0204DB8>,
 <Element 'time' at 0x0000021EE020B188>,
 <Element 'time' at 0x0000021EE020B4A8>,
 <Element 'time' at 0x0000021EE020B818>,
 <Element 'time' at 0x0000021EE020BB88>,
 <Element 'time' at 0x0000021EE020BEF8>,
 <Element 'time' at 0x0000021EE02112C8>,
 <Element 'time' at 0x0000021EE0211638>,
 <Element 'time'

In [70]:
weatherXML[4].find("time")

<Element 'time' at 0x0000021EE01EE598>