# How can I get my data if I do not have a file?

<img src="img/web-api.png" style="width:600px">

<img src="img/whiteimage.png" style="width:650px">

## API

An API (application programming interface) is a set of rules that lets programs talk to each other. The developer exposes the API on a server, and clients send requests to it.

In [226]:
from IPython.display import HTML

HTML('<iframe width="560" height="315" src="https://www.youtube.com/embed/s7wmiS2mSXY" frameborder="0" allowfullscreen></iframe>')



`REST` determines what the API looks like. It stands for “Representational State Transfer”, a set of rules that developers follow when they design an API. One of these rules states that you should be able to get a piece of data (called a resource) when you request a specific URL.

Calling a URL is a **request**, while the data sent back to you is the **response**.

<img src="img/whiteimage.png" style="width:650px">

### Request

#### The endpoint

The endpoint is the URL you send the request to. It has the following structure:

* Base URL

This is the beginning of the request URL: the domain, which stays the same in every request. For example:
https://jsonplaceholder.typicode.com

* Resource or Path

The resource is the type of information you are looking for. Suppose we want to read posts, so we add the `posts` resource:
https://jsonplaceholder.typicode.com/posts

* Query String

The query string contains the parameters of the request. To get the posts from the user with id 1, we add the parameter `?userId=1`, and the URL looks like this:
https://jsonplaceholder.typicode.com/posts?userId=1

As you can see above, the query string starts with `?`. To send more than one parameter, separate them with `&`:
https://jsonplaceholder.typicode.com/posts?userId=1&id=2

Note: The Query String is not only used for filters. It can be used as paging parameters, versioning, sorting, and more.
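The three parts described above can also be assembled in code. The following is a minimal sketch using only Python's standard library; the variable names are chosen for this example, and the URL is the same JSONPlaceholder address used above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

base_url = "https://jsonplaceholder.typicode.com"
resource = "/posts"
params = {"userId": 1, "id": 2}

# urlencode joins the parameters with "&"; the "?" is added by hand.
url = f"{base_url}{resource}?{urlencode(params)}"
print(url)

# Splitting the URL again recovers each part.
parts = urlsplit(url)
print(parts.netloc)            # domain of the base URL
print(parts.path)              # resource
print(parse_qs(parts.query))   # query string as a dictionary of lists
```

Note that `parse_qs` returns every value as a list of strings, because a parameter may appear more than once in a query string.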

<img src="img/whiteimage.png" style="width:650px">

### Response

#### Code
**200** OK. 

**301** Moved Permanently.

**400** Bad Request.

**401** Unauthorized.

**403** Forbidden.

**404** Not Found.

**503** Service Unavailable.
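The first digit of a status code tells you its broad category: 2xx means success, 3xx redirection, 4xx a client error, and 5xx a server error. As a small sketch, the standard library's `http.HTTPStatus` can look up the standard phrase for each code listed above; the helper name `describe` is made up for this example.

```python
from http import HTTPStatus


def describe(code):
    """Return the standard phrase and a broad category for a status code."""
    phrase = HTTPStatus(code).phrase
    if 200 <= code < 300:
        category = "success"
    elif 300 <= code < 400:
        category = "redirection"
    elif 400 <= code < 500:
        category = "client error"
    elif 500 <= code < 600:
        category = "server error"
    else:
        category = "informational"
    return phrase, category


for code in (200, 301, 400, 401, 403, 404, 503):
    print(code, *describe(code))
```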

In [227]:
import requests #https://requests.readthedocs.io/en/master/
response = requests.get("http://api.open-notify.org/this-api-doesnt-exist")
print(response.status_code)

404


In [228]:
response = requests.get("https://jsonplaceholder.typicode.com/posts?userId=1")
print(response.status_code)

200


<img src="img/whiteimage.png" style="width:650px">

#### Method


The method tells the server what type of action the request performs.

The main methods are:

* GET (retrieve data)
* POST (send data)
* PUT and PATCH (update data)
* DELETE (delete data)

<img src="img/whiteimage.png" style="width:650px">

#### Get Data response types

##### JSON ([reference](https://www.json.org/json-en.html))

JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. 

JSON is built on two structures:
* A collection of name/value pairs. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array.

* An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence.

Example:
```
[
  {
    "userId": 1,
    "id": 1,
    "title": "sunt aut facere repellat provident occaecati excepturi optio reprehenderit",
    "body": "quia et suscipit\nsuscipit recusandae consequuntur expedita et cum\nreprehenderit molestiae ut ut quas totam\nnostrum rerum est autem sunt rem eveniet architecto"
  }
]
```
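In Python, the two JSON structures map to a `dict` (name/value pairs) and a `list` (ordered values). Here is a minimal sketch with the standard `json` module; the title is shortened for readability, and the variable names are chosen for this example.

```python
import json

raw = """
[
  {
    "userId": 1,
    "id": 1,
    "title": "sunt aut facere"
  }
]
"""

posts = json.loads(raw)   # JSON array  -> Python list
first = posts[0]          # JSON object -> Python dict

print(type(posts), type(first))
print(first["title"])

# json.dumps goes the other way: Python objects -> JSON text.
print(json.dumps(posts, indent=2))
```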

In [24]:
response = requests.get("https://jsonplaceholder.typicode.com/posts?userId=1")

print(response.content)

b'[\n  {\n    "userId": 1,\n    "id": 1,\n    "title": "sunt aut facere repellat provident occaecati excepturi optio reprehenderit",\n    "body": "quia et suscipit\\nsuscipit recusandae consequuntur expedita et cum\\nreprehenderit molestiae ut ut quas totam\\nnostrum rerum est autem sunt rem eveniet architecto"\n  },\n  {\n    "userId": 1,\n    "id": 2,\n    "title": "qui est esse",\n    "body": "est rerum tempore vitae\\nsequi sint nihil reprehenderit dolor beatae ea dolores neque\\nfugiat blanditiis voluptate porro vel nihil molestiae ut reiciendis\\nqui aperiam non debitis possimus qui neque nisi nulla"\n  },\n  {\n    "userId": 1,\n    "id": 3,\n    "title": "ea molestias quasi exercitationem repellat qui ipsa sit aut",\n    "body": "et iusto sed quo iure\\nvoluptatem occaecati omnis eligendi aut ad\\nvoluptatem doloribus vel accusantium quis pariatur\\nmolestiae porro eius odio et labore et velit aut"\n  },\n  {\n    "userId": 1,\n    "id": 4,\n    "title": "eum et est occaecati",

In [23]:
print(response.text)

[
  {
    "userId": 1,
    "id": 1,
    "title": "sunt aut facere repellat provident occaecati excepturi optio reprehenderit",
    "body": "quia et suscipit\nsuscipit recusandae consequuntur expedita et cum\nreprehenderit molestiae ut ut quas totam\nnostrum rerum est autem sunt rem eveniet architecto"
  },
  {
    "userId": 1,
    "id": 2,
    "title": "qui est esse",
    "body": "est rerum tempore vitae\nsequi sint nihil reprehenderit dolor beatae ea dolores neque\nfugiat blanditiis voluptate porro vel nihil molestiae ut reiciendis\nqui aperiam non debitis possimus qui neque nisi nulla"
  },
  {
    "userId": 1,
    "id": 3,
    "title": "ea molestias quasi exercitationem repellat qui ipsa sit aut",
    "body": "et iusto sed quo iure\nvoluptatem occaecati omnis eligendi aut ad\nvoluptatem doloribus vel accusantium quis pariatur\nmolestiae porro eius odio et labore et velit aut"
  },
  {
    "userId": 1,
    "id": 4,
    "title": "eum et est occaecati",
    "body": "ullam et saepe reic

In [22]:
print(response.json())

[{'userId': 1, 'id': 1, 'title': 'sunt aut facere repellat provident occaecati excepturi optio reprehenderit', 'body': 'quia et suscipit\nsuscipit recusandae consequuntur expedita et cum\nreprehenderit molestiae ut ut quas totam\nnostrum rerum est autem sunt rem eveniet architecto'}, {'userId': 1, 'id': 2, 'title': 'qui est esse', 'body': 'est rerum tempore vitae\nsequi sint nihil reprehenderit dolor beatae ea dolores neque\nfugiat blanditiis voluptate porro vel nihil molestiae ut reiciendis\nqui aperiam non debitis possimus qui neque nisi nulla'}, {'userId': 1, 'id': 3, 'title': 'ea molestias quasi exercitationem repellat qui ipsa sit aut', 'body': 'et iusto sed quo iure\nvoluptatem occaecati omnis eligendi aut ad\nvoluptatem doloribus vel accusantium quis pariatur\nmolestiae porro eius odio et labore et velit aut'}, {'userId': 1, 'id': 4, 'title': 'eum et est occaecati', 'body': 'ullam et saepe reiciendis voluptatem adipisci\nsit amet autem assumenda provident rerum culpa\nquis hic c

In [25]:
print(type(response.content))
print(type(response.json()))
print(type(response.text))

<class 'bytes'>
<class 'list'>
<class 'str'>
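The three types printed above are three views of the same payload. Roughly speaking, `response.content` holds the raw bytes, `response.text` is those bytes decoded into a string, and `response.json()` is that string parsed as JSON. The sketch below imitates these steps offline with a hypothetical, shortened payload, so it approximates what happens and is not the internal code of `requests`.

```python
import json

# Hypothetical, shortened payload standing in for response.content.
content = b'[{"userId": 1, "id": 2, "title": "qui est esse"}]'

text = content.decode("utf-8")   # bytes -> str, similar to response.text
data = json.loads(text)          # str -> list of dicts, similar to response.json()

print(type(content), type(text), type(data))
print(data[0]["title"])
```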


In [38]:
print(response.json()[0], '\n')
print(response.json()[0]['title'], '\n')

for elem in response.json():
    print(elem['title'])

{'userId': 1, 'id': 1, 'title': 'sunt aut facere repellat provident occaecati excepturi optio reprehenderit', 'body': 'quia et suscipit\nsuscipit recusandae consequuntur expedita et cum\nreprehenderit molestiae ut ut quas totam\nnostrum rerum est autem sunt rem eveniet architecto'} 

sunt aut facere repellat provident occaecati excepturi optio reprehenderit 

sunt aut facere repellat provident occaecati excepturi optio reprehenderit
qui est esse
ea molestias quasi exercitationem repellat qui ipsa sit aut
eum et est occaecati
nesciunt quas odio
dolorem eum magni eos aperiam quia
magnam facilis autem
dolorem dolore est ipsam
nesciunt iure omnis dolorem tempora et accusantium
optio molestias id quia eum


<img src="img/whiteimage.png" style="width:650px">

##### XML   

XML stands for "Extensible Markup Language". It is a markup language for storing and exchanging data with a specific structure, and some web services return it instead of JSON.

XML organizes data as a tree, which is easy to interpret and supports hierarchies. A file that follows the XML rules is called an XML document.

* XML documents have sections, called elements, defined by a beginning and an ending tag. A tag is a markup construct that begins with < and ends with >. The characters between the start-tag and end-tag, if there are any, are the element's content. Elements can contain markup, including other elements, which are called "child elements".


* The largest, top-level element is called the root, which contains all other elements.


* Attributes are name–value pairs that exist within a start-tag or empty-element tag. An XML attribute can only have a single value and each attribute can appear at most once on each element.


<img src="img/whiteimage.png" style="width:650px">

###### Elements

**Tag**: a string representing the type of data being stored.

**Attributes**: a number of attributes, stored as a dictionary.

**Text String**: a text string with the information to be displayed.

**Child Elements**: a number of child elements, stored as a sequence.

Below you can see an example:

~~~~
<?xml version="1.0" encoding="UTF-8"?>
<metadata>
<food>
    <item name="breakfast">Idly</item>
    <price>$2.5</price>
    <description>
   Two idly's with chutney
   </description>
    <calories>553</calories>
</food>
<food>
    <item name="breakfast">Paper Dosa</item>
    <price>$2.7</price>
    <description>
    Plain paper dosa with chutney
    </description>
    <calories>700</calories>
</food>
<food>
    <item name="breakfast">Upma</item>
    <price>$3.65</price>
    <description>
    Rava upma with bajji
    </description>
    <calories>600</calories>
</food>
<food>
    <item name="breakfast">Bisi Bele Bath</item>
    <price>$4.50</price>
    <description>
   Bisi Bele Bath with sev
    </description>
    <calories>400</calories>
</food>
<food>
    <item name="breakfast">Kesari Bath</item>
    <price>$1.95</price>
    <description>
    Sweet rava with saffron
    </description>
    <calories>950</calories>
</food>
</metadata>
~~~~

In [229]:
r = requests.get('http://www.mocky.io/v2/5e7a343330000066749308b2') 
#https://www.mocky.io/
#https://www.edureka.co/blog/python-xml-parser-tutorial/
print(r.content)

b'<?xml version="1.0" encoding="UTF-8"?>\n<metadata>\n<food>\n    <item name="breakfast">Idly</item>\n    <price>$2.5</price>\n    <description>\n   Two idly\'s with chutney\n   </description>\n    <calories>553</calories>\n</food>\n<food>\n    <item name="breakfast">Paper Dosa</item>\n    <price>$2.7</price>\n    <description>\n    Plain paper dosa with chutney\n    </description>\n    <calories>700</calories>\n</food>\n<food>\n    <item name="breakfast">Upma</item>\n    <price>$3.65</price>\n    <description>\n    Rava upma with bajji\n    </description>\n    <calories>600</calories>\n</food>\n<food>\n    <item name="breakfast">Bisi Bele Bath</item>\n    <price>$4.50</price>\n    <description>\n   Bisi Bele Bath with sev\n    </description>\n    <calories>400</calories>\n</food>\n<food>\n    <item name="breakfast">Kesari Bath</item>\n    <price>$1.95</price>\n    <description>\n    Sweet rava with saffron\n    </description>\n    <calories>950</calories>\n</food>\n</metadata>'


In [230]:
import xml.etree.ElementTree as ET
root = ET.fromstring(r.content) #https://docs.python.org/3/library/xml.etree.elementtree.html
print(root)
print(root.tag)
print(root.attrib)

<Element 'metadata' at 0x114a85a98>
metadata
{}


In [231]:
#print first element tag
print(root[0].tag)

food


In [232]:
#print elements tag
for child in root:
    print(child.tag)

food
food
food
food
food


In [233]:
#get elements information
for x in root[0]:
    print('tag: ', x.tag, 'attrib: ', x.attrib, 'text: ', x.text)

tag:  item attrib:  {'name': 'breakfast'} text:  Idly
tag:  price attrib:  {} text:  $2.5
tag:  description attrib:  {} text:  
   Two idly's with chutney
   
tag:  calories attrib:  {} text:  553


In [234]:
for child in root:
    for elem in child:
        print(elem.tag, elem.text)
    print()

item Idly
price $2.5
description 
   Two idly's with chutney
   
calories 553

item Paper Dosa
price $2.7
description 
    Plain paper dosa with chutney
    
calories 700

item Upma
price $3.65
description 
    Rava upma with bajji
    
calories 600

item Bisi Bele Bath
price $4.50
description 
   Bisi Bele Bath with sev
    
calories 400

item Kesari Bath
price $1.95
description 
    Sweet rava with saffron
    
calories 950



In [235]:
for x in root.findall('food'):
    item = x.find('item').text
    price = x.find('price').text
    print(item, price)

Idly $2.5
Paper Dosa $2.7
Upma $3.65
Bisi Bele Bath $4.50
Kesari Bath $1.95


In [236]:
for x in root.findall('food'):
    for item in x.findall('item'):
        print(item.text)

Idly
Paper Dosa
Upma
Bisi Bele Bath
Kesari Bath
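Once you can reach each element, a common next step is to turn the XML into plain Python data. The sketch below does that for the first two `food` entries of the example document, embedded as a string so it runs without a network connection. The dictionary keys and the decision to convert prices and calories to numbers are choices made for this example.

```python
import xml.etree.ElementTree as ET

xml_data = """<metadata>
<food>
    <item name="breakfast">Idly</item>
    <price>$2.5</price>
    <description>
   Two idly's with chutney
   </description>
    <calories>553</calories>
</food>
<food>
    <item name="breakfast">Paper Dosa</item>
    <price>$2.7</price>
    <description>
    Plain paper dosa with chutney
    </description>
    <calories>700</calories>
</food>
</metadata>"""

root = ET.fromstring(xml_data)

menu = []
for food in root.findall("food"):
    item = food.find("item")
    menu.append({
        "item": item.text,
        "meal": item.get("name"),                                  # attribute value
        "price": float(food.find("price").text.lstrip("$")),       # "$2.5" -> 2.5
        "description": food.find("description").text.strip(),      # drop surrounding whitespace
        "calories": int(food.find("calories").text),
    })

for entry in menu:
    print(entry)
```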


<img src="img/whiteimage.png" style="width:650px">

So far we have used GET to retrieve data in JSON or XML format, but we can also:
* Post (Send data)
* Put and Patch (Update Data)
* Delete (Delete data)



##### Post

In [127]:
url = 'https://jsonplaceholder.typicode.com/posts'
myobj = {'title': 'foo', 'body': 'bar', 'userId': 1}

r = requests.post(url, data = myobj)

print(r.status_code, r.json())  # or r.text

201 {'title': 'foo', 'body': 'bar', 'userId': '1', 'id': 101}
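In the POST output above, `userId` comes back as the string `'1'` even though we sent the integer `1`. A likely reason is that `requests` form-encodes a dictionary passed through `data=`, and form fields are plain text. Passing the same dictionary through `json=` sends a JSON body instead, which keeps the number type. The sketch below reproduces both encodings with the standard library only, so it runs offline; it illustrates the two body formats and is not the internal code of `requests`.

```python
import json
from urllib.parse import urlencode, parse_qs

myobj = {"title": "foo", "body": "bar", "userId": 1}

# Form encoding (what data=myobj produces): every value becomes text.
form_body = urlencode(myobj)
print(form_body)
print(parse_qs(form_body))

# JSON encoding (what json=myobj produces): the integer stays an integer.
json_body = json.dumps(myobj)
print(json_body)
print(json.loads(json_body))
```

If you need the server to receive real numbers, `requests.post(url, json=myobj)` is usually the better fit.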


##### Put

In [135]:
r = requests.put('https://jsonplaceholder.typicode.com/posts/1', data = {'title': 'foo','body': 'bar','userId': 1})

print(r.status_code, r.json())  # or r.text

200 {'title': 'foo', 'body': 'bar', 'userId': '1', 'id': 1}


##### Delete

In [137]:
r = requests.delete('https://jsonplaceholder.typicode.com/posts/1')

print(r.status_code, r.json())

200 {}


<img src="img/whiteimage.png" style="width:650px">

___

### How can I provide my own data as a REST API?

You can use Flask to create an API and expose its endpoints ([reference](https://programminghistorian.org/en/lessons/creating-apis-with-python-and-flask)).

The following code is in the Python file `server_example.py`:
```
import flask
from flask import request, jsonify

app = flask.Flask(__name__)
app.config["DEBUG"] = True

# Create some test data for our catalog in the form of a list of dictionaries.
books = [
    {'id': 0,
     'title': 'A Fire Upon the Deep',
     'author': 'Vernor Vinge',
     'first_sentence': 'The coldsleep itself was dreamless.',
     'year_published': '1992'},
    {'id': 1,
     'title': 'The Ones Who Walk Away From Omelas',
     'author': 'Ursula K. Le Guin',
     'first_sentence': 'With a clamor of bells that set the swallows soaring, the Festival of Summer came to the city Omelas, bright-towered by the sea.',
     'published': '1973'},
    {'id': 2,
     'title': 'Dhalgren',
     'author': 'Samuel R. Delany',
     'first_sentence': 'to wound the autumnal city.',
     'published': '1975'}
]


@app.route('/')
def hello_world():
    return jsonify([{'Hello': 'World!'}])


# A route to return all of the available entries in our catalog.
@app.route('/api/v1/resources/books/all', methods=['GET'])
def api_all():
    return jsonify(books)

app.run()

```

Then run `python server_example.py`

In [113]:
r = requests.get('http://127.0.0.1:5000')
print(r.status_code)
print(r.json())

200
[{'Hello': 'World!'}]


In [114]:
r = requests.get('http://127.0.0.1:5000/api/v1/resources/books/all')
print(r.status_code)
print(r.json())

200
[{'author': 'Vernor Vinge', 'first_sentence': 'The coldsleep itself was dreamless.', 'id': 0, 'title': 'A Fire Upon the Deep', 'year_published': '1992'}, {'author': 'Ursula K. Le Guin', 'first_sentence': 'With a clamor of bells that set the swallows soaring, the Festival of Summer came to the city Omelas, bright-towered by the sea.', 'id': 1, 'published': '1973', 'title': 'The Ones Who Walk Away From Omelas'}, {'author': 'Samuel R. Delany', 'first_sentence': 'to wound the autumnal city.', 'id': 2, 'published': '1975', 'title': 'Dhalgren'}]
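The Flask server above does three things: it listens on a port, maps a URL path to a function, and returns JSON. As a rough illustration of those steps, and not a replacement for Flask, the sketch below uses only Python's standard library (`http.server`) instead. The class name `BooksHandler`, the shortened `BOOKS` list, and the use of an automatically assigned free port are assumptions made for this example.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

# Shortened test data (hypothetical subset of the catalog above).
BOOKS = [
    {"id": 0, "title": "A Fire Upon the Deep", "author": "Vernor Vinge"},
    {"id": 2, "title": "Dhalgren", "author": "Samuel R. Delany"},
]


class BooksHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Map one URL path to one JSON response; everything else is a 404.
        if self.path == "/api/v1/resources/books/all":
            body = json.dumps(BOOKS).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), BooksHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Act as the client: request the endpoint and parse the JSON response.
with urlopen(f"http://127.0.0.1:{port}/api/v1/resources/books/all") as resp:
    status = resp.status
    fetched = json.loads(resp.read().decode("utf-8"))

server.shutdown()
server.server_close()

print(status)
print(fetched)
```

For anything beyond a toy example, a framework such as Flask handles routing, errors, and query parameters with far less code.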


#### References 

https://towardsdatascience.com/twitter-data-collection-tutorial-using-python-3267d7cfa93e (highly recommended)

https://www.datacamp.com/community/tutorials/python-xml-elementtree

https://sdbrett.com/BrettsITBlog/2017/03/python-parsing-api-xml-response-data/

<img src="img/whiteimage.png" style="width:650px">

___
## Web Scraping

What can I do if the data I need is not available through APIs?

Install Selenium using:
`pip3 install selenium`
or
`conda install -c conda-forge selenium`

Download the driver that matches your Chrome version from https://chromedriver.chromium.org/downloads

In [155]:
from selenium import webdriver

In [156]:
wd = webdriver.Chrome(executable_path=r"/Users/edwin.jimenez/Saturdays AI/chromedriver")

In [157]:
wd.get('https://dev.to/lewiskori/beginner-s-guide-to-web-scraping-with-python-s-selenium-3fl9') 

<img src="img/whiteimage.png" style="width:650px">

Before we go further, we need to know XPath.

### What is XPath? [reference](https://www.guru99.com/xpath-selenium.html)

XPath stands for XML Path Language. It is a syntax for finding any element on a web page using path expressions. XPath locates elements through the HTML DOM structure. (The Document Object Model is a cross-platform and language-independent interface that treats an XML or HTML document as a tree structure wherein each node is an object representing a part of the document; [reference](https://en.wikipedia.org/wiki/Document_Object_Model).) The basic format of XPath is shown in the screenshot below.
<img src="https://www.guru99.com/images/3-2016/032816_0758_XPathinSele1.png" style="width:600px">

#### Syntax for XPath:

An XPath expression contains the path to the element on the web page. The standard syntax for creating an XPath is:

`Xpath=//tagname[@attribute='value']`

- // : Select matching nodes anywhere in the document, regardless of their position.

- Tagname: Tagname of the particular node.

- @: Select attribute.

- Attribute: Attribute name of the node.

- Value: Value of the attribute. 
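The `//tagname[@attribute='value']` pattern can be tried without a browser. Python's `xml.etree.ElementTree` supports a limited subset of XPath, which is enough for this pattern. In the sketch below, the small page is made up for this example, and the leading `.` is needed because ElementTree evaluates paths relative to the element you call `findall` on.

```python
import xml.etree.ElementTree as ET

# A tiny, well-formed page invented for this example.
page = ET.fromstring("""<html>
  <body>
    <div class="quote">
      <span class="text">First quote</span>
      <small class="author">Author A</small>
    </div>
    <div class="quote">
      <span class="text">Second quote</span>
      <small class="author">Author B</small>
    </div>
  </body>
</html>""")

# .//span[@class='text'] : any <span> below the root whose class is "text"
texts = [span.text for span in page.findall(".//span[@class='text']")]

# .//small[@class='author'] : any <small> below the root whose class is "author"
authors = [small.text for small in page.findall(".//small[@class='author']")]

print(texts)
print(authors)
```

Real web pages are often not well-formed XML, so this is only a way to practice the syntax; Selenium evaluates XPath against the page loaded in the browser.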

To find elements on a web page accurately, there are different types of locators:

| Locator    | Description |
|------------|-------------|
| ID         | Finds the element by its ID |
| Class name | Finds the element by its class name |
| Link text  | Finds the element by the text of the link |
| XPath      | Finds dynamic elements and traverses between various elements of the web page |

<img src="img/whiteimage.png" style="width:650px">

#### Types of Xpath

* Absolute XPath

It is the direct way to find the element, but the disadvantage of an absolute XPath is that it stops working if anything changes along the path to the element.

The key characteristic of an absolute XPath is that it begins with a single forward slash (/), which means the element is selected starting from the root node.

For example:
`/html/body/div[2]/section[3]/div/div[1]/main/div[1]/div/div/div/div/div/div[2]/p[3]`

* Relative XPath

A relative XPath starts from the middle of the HTML DOM structure. It begins with a double forward slash (//), which means it can search for the element anywhere on the web page.

Because you can start from the middle of the HTML DOM structure, there is no need to write a long XPath.

For example, this is the relative XPath for the same element that the absolute XPath above refers to:
`//*[@id="g-mainbar"]/div[1]/div/div/div/div/div/div[2]/p[3]`
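To see why an absolute path is fragile, the sketch below runs both styles against two versions of a made-up page: the original, and a redesign that wraps the content in a new `<section>`. It uses `xml.etree.ElementTree`, where paths are relative to the root element, so `./body/div/p` plays the role of `/html/body/div/p`. The documents and variable names are assumptions for this example.

```python
import xml.etree.ElementTree as ET

original = ET.fromstring(
    "<html><body><div><p id='target'>Hello</p></div></body></html>"
)
# Same content after a redesign wraps the <div> in a new <section>.
redesigned = ET.fromstring(
    "<html><body><section><div><p id='target'>Hello</p></div></section></body></html>"
)

absolute_like = "./body/div/p"      # every step from the root spelled out
relative = ".//p[@id='target']"     # search anywhere below the root

results = {}
for name, doc in (("original", original), ("redesigned", redesigned)):
    found_absolute = doc.find(absolute_like) is not None
    found_relative = doc.find(relative) is not None
    results[name] = (found_absolute, found_relative)
    print(name, "absolute:", found_absolute, "relative:", found_relative)
```

The absolute-style path finds the paragraph only in the original page, while the relative path keeps working after the redesign.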

Try it!
* Go to your Chrome window.
* Right-click on the page, select `Inspect`, and look for an element.
* Right-click on the element, select `Copy`, and choose `Copy XPath` or `Copy full XPath`.



<img src="img/whiteimage.png" style="width:650px">

#### Example

In [158]:
wd.get('http://quotes.toscrape.com/page/1/') 

In [162]:
quotes = wd.find_elements_by_class_name("quote")
for quote in quotes:
    print(quote)

<selenium.webdriver.remote.webelement.WebElement (session="080972838bb4b03c932e5b131bef2953", element="249b68d5-0fea-4668-ab0a-42896d1147e6")>
<selenium.webdriver.remote.webelement.WebElement (session="080972838bb4b03c932e5b131bef2953", element="7d40f4d7-2e54-4ccb-a179-8942650979cd")>
<selenium.webdriver.remote.webelement.WebElement (session="080972838bb4b03c932e5b131bef2953", element="bc4b6693-1d3a-4ed2-ae5e-e43e51c6c40e")>
<selenium.webdriver.remote.webelement.WebElement (session="080972838bb4b03c932e5b131bef2953", element="67e7ccf5-344f-48ab-877a-80c400f85cb5")>
<selenium.webdriver.remote.webelement.WebElement (session="080972838bb4b03c932e5b131bef2953", element="aaaddcaf-bfe8-420e-8618-0ea987cf88f5")>
<selenium.webdriver.remote.webelement.WebElement (session="080972838bb4b03c932e5b131bef2953", element="13fe9eab-e797-407e-b65c-08a44805ba8c")>
<selenium.webdriver.remote.webelement.WebElement (session="080972838bb4b03c932e5b131bef2953", element="5bfca53c-22d1-47ee-9fe7-294f6a7dba72")>

In [168]:
quotes = wd.find_elements_by_class_name("quote")
for quote in quotes:
    quote_text = quote.find_element_by_class_name('text')
    author = quote.find_element_by_class_name('author')
    tags = quote.find_element_by_class_name('tags')
    print(quote_text, author, tags)

<selenium.webdriver.remote.webelement.WebElement (session="080972838bb4b03c932e5b131bef2953", element="11ab8303-6716-43e4-96d8-01404d3037fe")> <selenium.webdriver.remote.webelement.WebElement (session="080972838bb4b03c932e5b131bef2953", element="2eb7a7fe-7720-4378-97b3-c58c16c1dc97")> <selenium.webdriver.remote.webelement.WebElement (session="080972838bb4b03c932e5b131bef2953", element="6ed7f67d-7fa9-49ff-bd2b-34d113bb212e")>
<selenium.webdriver.remote.webelement.WebElement (session="080972838bb4b03c932e5b131bef2953", element="d38b6284-81c9-4372-93d6-c63376f909b2")> <selenium.webdriver.remote.webelement.WebElement (session="080972838bb4b03c932e5b131bef2953", element="b7e9bc74-78c6-4130-9fb9-68d5202b06b3")> <selenium.webdriver.remote.webelement.WebElement (session="080972838bb4b03c932e5b131bef2953", element="03ae84d4-9d89-4ff5-8257-e401b7518adf")>
<selenium.webdriver.remote.webelement.WebElement (session="080972838bb4b03c932e5b131bef2953", element="6a50ac69-de8b-4fe4-be22-32a5944f7a1b")>

In [171]:
#we want to access the text value
quotes = wd.find_elements_by_class_name("quote")
for quote in quotes:
    quote_text = quote.find_element_by_class_name('text').text
    author = quote.find_element_by_class_name('author').text
    tags = quote.find_element_by_class_name('tags').text
    print(quote_text, '\n', author, '\n', tags, '\n')

“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.” 
 Albert Einstein 
 Tags: change deep-thoughts thinking world 

“It is our choices, Harry, that show what we truly are, far more than our abilities.” 
 J.K. Rowling 
 Tags: abilities choices 

“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.” 
 Albert Einstein 
 Tags: inspirational life live miracle miracles 

“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.” 
 Jane Austen 
 Tags: aliteracy books classic humor 

“Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.” 
 Marilyn Monroe 
 Tags: be-yourself inspirational 

“Try not to become a man of success. Rather become a man of value.” 
 Albert Einstein 
 Tags: adulthood success value 

“It is better to be hated for what you are tha

In [180]:
# What if we want the quotes of the first 10 pages?
def get_quotes(wd):
    for i in range(1,11):
        wd.get('http://quotes.toscrape.com/page/'+str(i)+'/')
        quotes = wd.find_elements_by_class_name("quote")
        for quote in quotes:
            quote_text = quote.find_element_by_class_name('text').text
            author = quote.find_element_by_class_name('author').text
            tags = quote.find_element_by_class_name('tags').text
            print(quote_text, '\n', author, '\n', tags, '\n')

In [None]:
get_quotes(wd)

In [182]:
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument('--headless')
wd = webdriver.Chrome(executable_path=r"/Users/edwin.jimenez/Saturdays AI/chromedriver", options=options)  # chrome_options= is deprecated
get_quotes(wd)
wd.quit()

“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.” 
 Albert Einstein 
 Tags: change deep-thoughts thinking world 

“It is our choices, Harry, that show what we truly are, far more than our abilities.” 
 J.K. Rowling 
 Tags: abilities choices 

“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.” 
 Albert Einstein 
 Tags: inspirational life live miracle miracles 

“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.” 
 Jane Austen 
 Tags: aliteracy books classic humor 

“Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.” 
 Marilyn Monroe 
 Tags: be-yourself inspirational 

“Try not to become a man of success. Rather become a man of value.” 
 Albert Einstein 
 Tags: adulthood success value 

“It is better to be hated for what you are tha

“It matters not what someone is born, but what they grow to be.” 
 J.K. Rowling 
 Tags: dumbledore 

“Love does not begin and end the way we seem to think it does. Love is a battle, love is a war; love is a growing up.” 
 James Baldwin 
 Tags: love 

“There is nothing I would not do for those who are really my friends. I have no notion of loving people by halves, it is not my nature.” 
 Jane Austen 
 Tags: friendship love 

“Do one thing every day that scares you.” 
 Eleanor Roosevelt 
 Tags: attributed fear inspiration 

“I am good, but not an angel. I do sin, but I am not the devil. I am just a small girl in a big world trying to find someone to love.” 
 Marilyn Monroe 
 Tags: attributed-no-source 

“If I were not a physicist, I would probably be a musician. I often think in music. I live my daydreams in music. I see my life in terms of music.” 
 Albert Einstein 
 Tags: music 

“If you only read the books that everyone else is reading, you can only think what everyone else is thinkin

“A person's a person, no matter how small.” 
 Dr. Seuss 
 Tags: inspirational 

“... a mind needs books as a sword needs a whetstone, if it is to keep its edge.” 
 George R.R. Martin 
 Tags: books mind 



<img src="img/whiteimage.png" style="width:650px">

Imagine that you want to know every day which Mexican bank offers the best rate to buy or sell dollars, but the site you visit, [el dolar](https://www.eldolar.info/en/mexico/dia/hoy), does not have an API :(

How can you get the data?

In [222]:
from selenium.webdriver.common.by import By

wd = webdriver.Chrome(executable_path=r"/Users/edwin.jimenez/Saturdays AI/chromedriver")#, chrome_options=options)
wd.get('https://www.eldolar.info/en/mexico/dia/hoy')
menu = wd.find_element_by_tag_name('tbody')
all_tr = menu.find_elements_by_tag_name("tr")
for tr in all_tr:
    bank_name = tr.find_element_by_class_name('small-hide').text
    buy_sell = tr.find_elements_by_class_name('xTimes')
    if len(buy_sell)>1:
        buy = buy_sell[0].text
        sell = buy_sell[1].text
    else:
        buy = buy_sell[0].text
        sell = buy_sell[0].text
    print('Bank: ', bank_name, 'Buy: ', buy, 'Sell:', sell)
wd.quit()

Bank:  Banamex Buy:  24.19 Sell: 25.19
Bank:  Banco Azteca Buy:  21.50 Sell: 24.59
Bank:  Banco de México - FIX Buy:  25.1185 Sell: 25.1185
Bank:  Banco de México - Interbancario 48 horas Buy:  24.976 Sell: 24.976
Bank:  Banorte Buy:  23.55 Sell: 25.15
Bank:  BBVA Bancomer Buy:  23.93 Sell: 25.18
Bank:  DOF, Diario Oficial de la Federación Buy:  25.0782 Sell: 25.0782
Bank:  Inbursa Buy:  24.40 Sell: 25.40
Bank:  Intercam Buy:  24.37 Sell: 25.06
Bank:  IXE Buy:  23.55 Sell: 25.15
Bank:  Monex Buy:  24.52 Sell: 25.10
Bank:  Para pagos de obligaciones Buy:  24.1113 Sell: 24.1113
Bank:  SAT, Servicio de Administración Tributaria Buy:  25.0782 Sell: 25.0782
Bank:  Scotiabank Buy:  20.00 Sell: 25.00


<img src="img/whiteimage.png" style="width:650px">

---
Let's try it with a page that you know: get the top navigation bar elements by class.


In [210]:
from selenium.webdriver.common.by import By

wd = webdriver.Chrome(executable_path=r"/Users/edwin.jimenez/Saturdays AI/chromedriver")#, chrome_options=options)
wd.get('https://sites.google.com/view/saturdays-ai-guadalajara')
menu = wd.find_elements_by_class_name('VsJjtf')
for elem in menu:
    print (elem.text)
wd.quit()

Página principal
Calendario de actividades
Aviso de Privacidad



In [197]:
#but what will happen if the class name is changed?
from selenium.webdriver.common.by import By

wd = webdriver.Chrome(executable_path=r"/Users/edwin.jimenez/Saturdays AI/chromedriver")#, chrome_options=options)
wd.get('https://sites.google.com/view/saturdays-ai-guadalajara')
menu = wd.find_element_by_xpath('//*[@id="WDxLfe"]/ul').text
print(menu)
wd.quit()

Página principalCalendario de actividadesAviso de Privacidad


In [213]:
wd = webdriver.Chrome(executable_path=r"/Users/edwin.jimenez/Saturdays AI/chromedriver")#, chrome_options=options)
wd.get('https://sites.google.com/view/saturdays-ai-guadalajara')
menu = wd.find_element_by_xpath('//*[@id="WDxLfe"]/ul')
all_li = menu.find_elements_by_tag_name("li")
for li in all_li:
    text = li.text
    if text != '':
        print (text)
wd.quit()


Página principal
Calendario de actividades
Aviso de Privacidad


<img src="img/whiteimage.png" style="width:650px">

### Homework 
Extract the `Programa` information from https://sites.google.com/view/saturdays-ai-guadalajara/calendario-de-actividades

#### References
https://dev.to/lewiskori/beginner-s-guide-to-web-scraping-with-python-s-selenium-3fl9

https://medium.com/shakuro/adopting-ipython-jupyter-for-selenium-testing-d02309dd00b8