## Setup: Installing Packages

In [1]:
#!conda install -c conda-forge yagmail selenium

## I. Basic Selenium

**Import selenium and open webdriver**

In [2]:
from selenium import webdriver

# Mac: uncomment the next line and comment out the PC line below.
#driver = webdriver.Chrome("./chromedriver")

# PC (Windows): keep the next line; comment it out if you're on a Mac.
driver = webdriver.Chrome("./chromedriver.exe")

**Get webpage:** `driver.get("https://craigslist.org/")`

In [3]:
driver.get("https://craigslist.org/")

**Select element by link text:** `link = driver.find_element_by_link_text("best-of-craigslist")`

In [4]:
link = driver.find_element_by_link_text("best-of-craigslist")

In [5]:
link.text

'best-of-craigslist'

**Click on element:** `link.click()`

In [6]:
link.click()

**Select element(s) by CSS selector:** `driver.find_elements_by_css_selector('.date')`

In [7]:
dates = driver.find_elements_by_css_selector('.date')
print([date.text for date in dates])

['12 Sep 2020', '29 Mar 2020', '8 Jan 2020', '30 Oct 2019', '11 Oct 2019', '11 Sep 2019', '6 Aug 2019', '31 May 2019', '28 May 2019', '4 May 2019', '5 Feb 2019', '27 Jan 2019', '20 Jan 2019', '10 Jan 2019', '4 Dec 2018', '17 Oct 2018', '6 Sep 2018', '23 Aug 2018', '26 Jul 2018', '19 Jun 2018', '8 Jun 2018', '6 Jun 2018', '20 May 2018', '16 May 2018', '6 May 2018']
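
A note on Selenium versions: the `find_element_by_*` and `find_elements_by_*` helpers used in this notebook are the Selenium 3 style. As far as I know, they were removed in Selenium 4.3 in favor of the generic `driver.find_element(by, value)` and `driver.find_elements(by, value)`. The sketch below is one way to write a lookup that works with either. `find_all_css` is a hypothetical helper name, and the strings are the values of `By.CSS_SELECTOR` and `By.LINK_TEXT`.

```python
# Locator strategy strings; these are the values of By.CSS_SELECTOR and
# By.LINK_TEXT in selenium.webdriver.common.by.
CSS_SELECTOR = "css selector"
LINK_TEXT = "link text"


def find_all_css(driver, selector):
    """Return every element matching `selector` (hypothetical helper)."""
    legacy = getattr(driver, "find_elements_by_css_selector", None)
    if legacy is not None:
        # Selenium 3 style, as used elsewhere in this notebook.
        return legacy(selector)
    # Generic form, which newer Selenium releases expect.
    return driver.find_elements(CSS_SELECTOR, selector)


# Usage with the `driver` from In [2]:
# dates = find_all_css(driver, ".date")
```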


## Exercises

---

**1. Make your browser navigate to** `https://newyork.craigslist.org/`

In [8]:
driver.get("https://newyork.craigslist.org")

**2. Select and click on the link for** `furniture`

In [9]:
driver.find_element_by_link_text("furniture").click()

**3. Select and click the link labeled** `owner` **to filter down the posts.**

In [10]:
driver.find_element_by_link_text("owner").click()

**4. Use a CSS selector to target the checkbox labeled** `posted today` **then click it.**

In [11]:
driver.find_element_by_css_selector('.postedToday').click()

**5. Select the title of the first result and save it as** `element`

Hint: Use `.find_element_by_css_selector()` to grab just the first one

In [12]:
results = driver.find_elements_by_css_selector(".result-title")

# The first match is the one the exercise asks for.
element = results[0]

for item in results:
    print(item.text)

Media Cabinet / Tv Stand
*New* Anthropologie Nested Diamond Rug (5X8)
Full Bed Mattress
Full Bed Frame
Ikea - Bookcase / Stand
Counter Height Computer and Desk Stool, Black
Dining chair with armrests
VariDesk® Pro Plus 36 Standing Desk
West Elm Bistro Table
Beautiful ladder bookshelf!
58' Black Ultrasuede Loveseat
Storage frame and baskets
Storage frame and baskets
Solid wood classic chairs $20 each - moving sale!
Small cafe dining table
Night stand with drawers
Bathroom sink w/ vanity, faucet & towel holder
Free chair for pickup on UWS
Brand New Full Size Mattress in box
Dresser and optional bedside table
Restoration Hardware Glass Lamp w/ Shade
Two folding stools IKEA
Victorian style dark wood mirror and medicine cabinet set
Bombay & Co Mirror
Ikea Kitchen Cart - Discontinued - Great Shape
Interior Define - Jasper Chaise Lounge
Glass Lamp(s)
2 white China cabinet
Futon on UWS
Swivel Counter Chairs - White (Set of 2)
Wooden Stool
*New* Anthropologie Hand-Knotted Alia Rug (5x8)
BIG sol

**6. Select the search bar at the top of the page and save it as** `search`

In [13]:
search = driver.find_element_by_css_selector("#query")
search.click()

## II. Text with Selenium

**Get inner text from HTML element:** `element.text`

In [14]:
element = driver.find_element_by_css_selector(".result-title")
element.text

'Media Cabinet / Tv Stand'

**Get attribute from HTML element:** `element.get_attribute('href')`

In [15]:
element.get_attribute('href')

'https://newyork.craigslist.org/brx/fuo/d/bronx-media-cabinet-tv-stand/7204902610.html'

**Input text into field:** `search.send_keys('end table')`

In [16]:
search.send_keys("end table")

**Take a screenshot:** `driver.save_screenshot("warmup.png")`



In [17]:
driver.save_screenshot("warmup.png")

True

## Exercises

---
**1. Use CSS selectors to target the min and max price fields. Store them as** `min_` **and** `max_` (the trailing underscore avoids shadowing Python's built-in `min` and `max`).

In [18]:
min_ = driver.find_element_by_css_selector('input.min')
max_ = driver.find_element_by_css_selector('input.max')

**2. Use** `.send_keys()` **on** `min_` **and** `max_` **to input 5 as the minimum and 20 as the maximum.**



In [19]:
min_.send_keys('5')
max_.send_keys('20')

**3. Use a CSS selector to target the search button and click it.**

In [20]:
driver.find_element_by_css_selector('.searchbtn').click()

**4. Take a screenshot of the page with the search results**

In [21]:
driver.save_screenshot("LiveCoding.png")

True

**5. Select the first result and extract the href attribute**

In [22]:
first_result = driver.find_element_by_css_selector('.result-title')
print(first_result.get_attribute('href'))

https://newyork.craigslist.org/que/fuo/d/brooklyn-rolling-end-table/7204779327.html


## III. File I/O and Python Review

**Read lines from file**

```python
with open('items.csv') as f:
    for line in f.readlines():
        print(line)
```

In [23]:
with open('items.csv') as f:
    for line in f.readlines():
        print(line)

coffee table,furniture,5,15

triumph,motorcycles,800,20000



**Splitting a string:** `"gucci belt,clothes+acc,2,20".split(",")`
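
A quick sketch of what `.split(",")` returns for the string above, plus unpacking the pieces into one name per column. The variable names here are only for illustration.

```python
line = "gucci belt,clothes+acc,2,20"

# split(",") cuts the string at every comma and returns a list of strings.
fields = line.split(",")
print(fields)

# The list can be unpacked into one name per column.
description, category, min_price, max_price = fields
print(description, category, min_price, max_price)
```

Note that the prices come back as strings, not numbers.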

**Write to a file**

```python
with open('results.csv', 'a') as f:
    f.write("test,test,1,2\n")
```
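
The `'a'` mode appends to the end of the file instead of overwriting it, so repeated writes accumulate. A small sketch that uses a temporary directory so it doesn't touch your real `results.csv`; the path and the second row are made up for illustration.

```python
import os
import tempfile

# Work in a throwaway directory so no real file is modified.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "results.csv")

# Each open(..., 'a') adds to the end of the file; 'w' would start over.
with open(path, "a") as f:
    f.write("test,test,1,2\n")
with open(path, "a") as f:
    f.write("another,test,3,4\n")

with open(path) as f:
    lines = f.readlines()

print(lines)
```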

## Exercises

---

**1. Read in the file** `items.csv` **and use** `.split(",")` **on each line within the loop**

In [24]:
with open('items.csv') as f:
    # For-loop over f.readlines(). Remember to indent!
    for line in f.readlines():   
        # Each line needs to be split()
        line_list = line.split(',')

**2. Save each item in the resulting list as** `description`, `category`, `min_`, `max_`

In [25]:
with open('items.csv') as f:
    # For-loop over f.readlines(). Remember to indent!
    for line in f.readlines():
        
        # Each line needs to be split()
        line_list = line.split(',')
        description = line_list[0]
        category = line_list[1]
        min_ = line_list[2]
        max_ = line_list[3]

**3. Copy/Paste your code from the previous set of exercises into the loop, so that your program performs a search based on each line in the file.**


In [26]:
results = []
with open('items.csv') as f:
    # For-loop over f.readlines(). Remember to indent!
    for line in f.readlines():
        
        # Each line needs to be split()
        line_list = line.split(',')
        print(line_list)
        description = line_list[0]
        category = line_list[1]
        min_ = line_list[2]
        max_ = line_list[3]

        driver.get("https://newyork.craigslist.org")
        driver.find_element_by_link_text(category).click()
        
        min_price = driver.find_element_by_css_selector('input.min')
        max_price = driver.find_element_by_css_selector('input.max')
        min_price.send_keys(min_)
        max_price.send_keys(max_)       
        
        search = driver.find_element_by_css_selector("#query")
        search.send_keys(description)
        driver.find_element_by_css_selector('.searchbtn').click()
        
        for link in driver.find_elements_by_css_selector(".result-title"):
            results.append(link.get_attribute("href"))

['coffee table', 'furniture', '5', '15\n']
['triumph', 'motorcycles', '800', '20000\n']
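
Notice the `'\n'` stuck to the last field in the output above: `readlines()` keeps each line's trailing newline. When that string goes through `.send_keys()`, the newline may be treated as pressing Enter, which can submit the form earlier than intended. One possible fix is to strip the line before splitting. In the sketch below, `parse_item_line` is a hypothetical helper and `io.StringIO` stands in for `items.csv`.

```python
import io


def parse_item_line(line):
    """Split one CSV-style line into its four fields, minus the newline."""
    description, category, min_, max_ = line.strip().split(",")
    return description, category, min_, max_


# StringIO stands in for open('items.csv') so the sketch is self-contained.
fake_file = io.StringIO(
    "coffee table,furniture,5,15\ntriumph,motorcycles,800,20000\n"
)

# Skip blank lines so a trailing empty line does not break the unpacking.
items = [parse_item_line(line) for line in fake_file if line.strip()]
print(items)
```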


## IV. Sending e-mail

**Import yagmail package:** `import yagmail`

In [29]:
import yagmail
import getpass

**Sender username and password:** `yag = yagmail.SMTP('automatedalertbot', email_password_goes_here)`

In [30]:
user = getpass.getpass('User name: ')
passw = getpass.getpass("Password: ")

User name: ········
Password: ········


In [31]:
yag = yagmail.SMTP(user, passw)

**Send message:** `yag.send(your_address_goes_here, 'this is my subject', "this is my message")`

Note: If you want to text your phone instead, you can use the table below to find the e-mail address to text.

In [32]:
tgt_email = getpass.getpass('Full target email address: ')

Full target email address: ········


In [33]:
yag.send(tgt_email, 'Test', "Testing 123")

{}

## Exercises

---

**1. Use yagmail to send matching links to your e-mail address, instead of writing to file.**

In [34]:
yag.send(tgt_email, 'Automation Results', results)

{}
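
If the same listing shows up more than once, you may want to drop duplicates before e-mailing. A minimal sketch: `build_message` is a hypothetical helper, and the URLs are placeholders rather than real listings.

```python
def build_message(links):
    """Remove duplicate links (keeping first-seen order) and join them."""
    unique = list(dict.fromkeys(links))
    if not unique:
        return "No matching listings found."
    return "\n\n".join(unique)


example_links = [
    "https://example.org/a.html",
    "https://example.org/b.html",
    "https://example.org/a.html",
]
body = build_message(example_links)
print(body)

# With the objects from this notebook:
# yag.send(tgt_email, 'Automation Results', build_message(results))
```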

**2. Use the table below to find your phone carrier, and send the matching links to your phone.**

In [35]:
phone = getpass.getpass('Phone number (only digits): ')
domain = 'vzwpix.com'
yag.send(phone+'@'+domain, 'Check out this stuff on Craigslist', results)

Phone number (only digits): ········


{}

<table class="styled" style="width: 547.365px;" border="0" align="center">
<tbody>
<tr>
<td style="text-align: center; width: 145px;"><strong>Carrier</strong></td>
<td style="text-align: center; width: 275px;"><strong>SMS gateway domain</strong></td>
<td style="text-align: center; width: 275px;"><strong>MMS gateway domain</strong></td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>Alltel</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@message.alltel.com</td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@mms.alltelwireless.com</td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>AT&amp;T</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@txt.att.net</td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@mms.att.net</td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>Boost Mobile</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@myboostmobile.com</td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@myboostmobile.com</td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>Cricket Wireless</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@sms.cricketwireless.net</td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@mms.cricketwireless.net</td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>Project Fi</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@msg.fi.google.com</td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@msg.fi.google.com</td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>Sprint</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@messaging.sprintpcs.com</td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@pm.sprint.com</td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>T-Mobile</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@tmomail.net</td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@tmomail.net</td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>U.S. Cellular</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@email.uscc.net</td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@mms.uscc.net</td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>Verizon</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@vtext.com</td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@vzwpix.com</td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>Virgin Mobile</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@vmobl.com</td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@vmpix.com</td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>Republic Wireless</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@text.republicwireless.com</td>
<td style="text-align: center; width: 275px;"></td>
</tr>
</tbody>
</table>
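
The table above can be turned into a small lookup so the address is built for you. This sketch copies a few rows from the table; `GATEWAYS` and `gateway_address` are hypothetical names, and the phone number is a made-up placeholder. Gateway domains may change over time, so check that the one you pick still works for your carrier.

```python
# (SMS domain, MMS domain) for a few carriers, copied from the table above.
GATEWAYS = {
    "AT&T": ("txt.att.net", "mms.att.net"),
    "T-Mobile": ("tmomail.net", "tmomail.net"),
    "Verizon": ("vtext.com", "vzwpix.com"),
}


def gateway_address(phone, carrier, mms=False):
    """Build the e-mail address that forwards to `phone` as a text."""
    # Keep only the digits, so "(555) 010-0199" and "5550100199" both work.
    digits = "".join(ch for ch in phone if ch.isdigit())
    if len(digits) != 10:
        raise ValueError("expected a 10-digit phone number")
    sms_domain, mms_domain = GATEWAYS[carrier]
    return digits + "@" + (mms_domain if mms else sms_domain)


print(gateway_address("(555) 010-0199", "Verizon", mms=True))
```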

## With Improvements

In [41]:
results = []

with open('items.csv') as f:
    # For-loop over f.readlines(). Remember to indent!
    for line in f.readlines():
        
        # Each line needs to be split()
        line_list = line.split(',')
        description = line_list[0]
        category = line_list[1]
        min_ = line_list[2]
        max_ = line_list[3]

        driver.get("https://newyork.craigslist.org")
        driver.find_element_by_link_text(category).click()
        
        min_price = driver.find_element_by_css_selector('input.min')
        max_price = driver.find_element_by_css_selector('input.max')
        min_price.send_keys(min_)
        max_price.send_keys(max_)
        
        distance = driver.find_element_by_css_selector('.search_distance')
        postal = driver.find_element_by_css_selector('.postal')
        distance.send_keys("5")
        postal.send_keys("11238")
        
        
        search = driver.find_element_by_css_selector("#query")
        search.send_keys(description)
        driver.find_element_by_css_selector('.searchbtn').click()
        
        scam_words = ['rare', 'antique', 'pristine', 'sale!!']
        
        for link in driver.find_elements_by_css_selector(".result-title")[:5]:
            scam = False
            for word in scam_words:
                if word in link.text.lower():
                    scam = True
                    print("SCAM!  " + link.text)
            if not scam:
                results.append(link.get_attribute("href"))
print('\n\n'.join(results))

https://newyork.craigslist.org/que/fuo/d/brooklyn-parsons-coffee-table/7201559740.html

https://newyork.craigslist.org/brk/fuo/d/brooklyn-coffee-table-dark-wood/7203404850.html

https://newyork.craigslist.org/brk/fuo/d/brooklyn-ikea-lack-coffee-table/7203303447.html

https://newyork.craigslist.org/mnh/fuo/d/new-york-coffee-table/7203264702.html

https://newyork.craigslist.org/brk/fuo/d/brooklyn-moving-sale-storage-bed-lamps/7203167973.html

https://newyork.craigslist.org/mnh/mcy/d/new-york-triumph-tr6-tiger-650/7204149010.html

https://newyork.craigslist.org/lgi/mcy/d/long-island-city-2009-triumph-bonneville/7201183758.html

https://newyork.craigslist.org/mnh/mcy/d/new-york-2013-triumph-speed-triple/7199641252.html

https://newyork.craigslist.org/brk/mcy/d/brooklyn-2012-triumph-speed-triple-with/7197399921.html

https://newyork.craigslist.org/brk/mcy/d/long-island-city-2018-triumph-thruxton/7195654904.html
