## Setup: Installing Packages

In [1]:
#!conda install -c conda-forge yagmail selenium

## I. Basic Selenium

**Import selenium and open webdriver**

In [2]:
from selenium import webdriver
from selenium.webdriver.common.by import By

In [3]:
################################################################################
# Uncomment one of the following three lines corresponding to your OS
################################################################################
# Newer Mac with "Apple Silicon"
cService = webdriver.ChromeService(executable_path='./chromedriver_arm')

# Older Mac
#cService = webdriver.ChromeService(executable_path='./chromedriver') 

# Windows
#cService = webdriver.ChromeService(executable_path='./chromedriver.exe')
################################################################################

driver = webdriver.Chrome(service = cService)

# If `driver` can't find an element right away, it keeps retrying for up to 1 second before raising an error.
driver.implicitly_wait(1)

# Other chromedrivers are available at 
# https://developer.chrome.com/docs/chromedriver/downloads
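
As an alternative to commenting and uncommenting lines, the driver path could be chosen from the operating system at run time. The sketch below is only an illustration: `chromedriver_path` is a hypothetical helper name, the three file names are the ones used in the cell above, and falling back to `./chromedriver` on any other platform is an assumption.

```python
import platform


def chromedriver_path(system=None, machine=None):
    """Return the driver file name used in this notebook for a platform.

    `system` and `machine` default to what the `platform` module reports;
    they are parameters only so the function is easy to try out by hand.
    """
    system = system or platform.system()
    machine = machine or platform.machine()

    if system == "Windows":
        return "./chromedriver.exe"
    if system == "Darwin" and machine == "arm64":
        # Apple Silicon Macs normally report the machine type "arm64".
        return "./chromedriver_arm"
    # Older (Intel) Macs; also used as the fallback here (assumption).
    return "./chromedriver"


print(chromedriver_path())
```

The result could then be passed along as `webdriver.ChromeService(executable_path=chromedriver_path())`.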

**Get webpage:** `driver.get("https://craigslist.org/")`

In [4]:
driver.get("https://craigslist.org/")

**Select element by link text:** `link = driver.find_element('link text', "best-of-craigslist")`

In [5]:
link = driver.find_element('link text', "best-of-craigslist")

In [6]:
link.text

'best-of-craigslist'

**Click on element:** `link.click()`

In [7]:
link.click()

**Select element(s) by CSS selector:** `driver.find_elements('css selector', '.date')`

In [8]:
dates = driver.find_elements('css selector', '.date')
for date in dates:
    print(date.text)

19 May 2024
18 Jan 2024
17 Jan 2024
31 Oct 2023
28 Oct 2023
28 Sep 2023
3 Aug 2023
13 Jul 2023
30 Jun 2023
21 Jun 2023
17 Jun 2023
9 Jun 2023
8 Jun 2023
6 Jun 2023
2 Jun 2023
25 May 2023
17 Apr 2023
10 Apr 2023
19 Mar 2023
26 Dec 2022
28 Jun 2022
6 Aug 2021
4 Dec 2020
12 Sep 2020
29 Mar 2020


**Select something other than a link by its text:**  
You can use an XPath expression that matches the element's tag and text.
```python
driver.find_element(By.XPATH, "//tag[text()='element text']")
```
If the HTML element is `<a href="https://www.craigslist.org">CL</a>`, the tag is `a`, and the element text is `CL`.

In [9]:
driver.find_element(By.XPATH, "//a[text()='CL']").click()
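
The `//tag[text()='element text']` pattern can be wrapped in a small helper so the quoting is handled in one place. This is a minimal sketch: `xpath_by_text` is a hypothetical name, not part of Selenium, and it simply refuses text that contains both kinds of quote character.

```python
def xpath_by_text(tag, text):
    """Build an XPath matching <tag> elements whose text is exactly `text`."""
    if "'" not in text:
        return f"//{tag}[text()='{text}']"
    if '"' not in text:
        # The text contains an apostrophe, so wrap it in double quotes instead.
        return f'//{tag}[text()="{text}"]'
    raise ValueError("text contains both quote characters; write the XPath by hand")


print(xpath_by_text("a", "CL"))       # //a[text()='CL']
print(xpath_by_text("span", "it's"))  # //span[text()="it's"]
```

The returned string can be used as the second argument of `driver.find_element(By.XPATH, ...)`.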

## Exercises

---

**1. Make your browser navigate to** `https://newyork.craigslist.org/`

In [10]:
driver.get("https://newyork.craigslist.org")

**2. Select and click on the link for** `furniture`

In [11]:
driver.find_element('link text', "furniture").click()

**3. Use the** `owner` **button's XPath to filter down the posts.**

In [12]:
driver.find_element(By.XPATH, "//button[text()='owner']").click()

**4. Use a CSS selector to target the checkbox labeled** `posted today` **then click it.**

In [13]:
driver.find_element('css selector', '[name="postedToday"]').click()

**5. Print the text for all elements of the `.label` class.**

In [14]:
results = driver.find_elements('css selector', '.label')

for item in results:
    print(item.text)

CL
new york
all new york
for sale
furniture







safety tips
prohibited items
product recalls
avoiding scams

gallery
newest





























help
safety
privacy
terms
about
app




**6. Print the result titles**  
Notice that the titles of the results have the class `.label`. However, if you want to print only *result titles*, the `.label` class is too inclusive. See if the result titles are children of elements that belong to a class that only includes result titles.



*Hint:  
To select children of the element with id `hello` that are part of the `.example` class, you would use the CSS selector*
```css
#hello > .example
```

In [15]:
results = driver.find_elements('css selector', '.posting-title')

for item in results[:20]:
    print(item.text)

Corner Wooden Desk
eames herman miller fiberglass shell
Basi Oak Queen Bed Frame - Article
Architect drawing table
Moroccan Hanging Lamp
Wood Flat Floor Mirror
Wodden Table and Chairs
9x12 west elm wool rug
Set of Five Salterini Mid-Century Curved Rounded Benches Ex Cond.
Wooden Coffee Table
Reverie R650 Adjustable Bed Power Base
NEED GONE ASAP - FAUX BROWN LEATHER CHAIR
For Sale: Sturdy 2-Drawer TV Stand
Black Leather Sofa OBO
Mid-century modern style round wooden table with steel hairpin legs
Homary - Large Center console/TV Stand
Vintage Eames brown hopsack Aluminum group chair
Golden Bar Stools
!!Accent chairs, reception chairs, Dining chairs, Computer Leather
DWR Queen Min bed frame - powder coated white frame


**7. Select the search bar at the top of the page, save it as** `search`**, and click it.**  
You should see your cursor blinking in the search bar if you're successful.

In [16]:
search = driver.find_element('css selector', '[enterkeyhint="search"]')
search.click()

## II. Text with Selenium

**Get inner text from HTML element:** `element.text`

In [17]:
element = driver.find_element('css selector', ".posting-title")
element.text

'Corner Wooden Desk'

**Get attribute from HTML element:** `element.get_attribute('href')`

In [18]:
element.get_attribute('href')

'https://newyork.craigslist.org/mnh/fuo/d/new-york-corner-wooden-desk/7761997754.html'

**Input text into field:** `search.send_keys('end table')`

In [19]:
search.clear()
search.send_keys("end table")

**Take a screenshot:** `driver.save_screenshot("warmup.png")`



In [20]:
driver.save_screenshot("warmup.png")

True

## Exercises

---
**1. Use CSS selectors to target the min and max price fields. Store them as** `min_field` **and** `max_field`

In [21]:
min_field = driver.find_element('css selector', '[placeholder="min"]')
max_field = driver.find_element('css selector', '[placeholder="max"]')

**2. Use** `.send_keys()` **on** `max_field` **and** `min_field` **to input 5 as minimum and 20 as maximum**



In [22]:
min_field.send_keys('5')
max_field.send_keys('20')

**3. Use a CSS selector to target the search button and click it.**

In [23]:
driver.find_element('css selector', '.cl-exec-search').click()

**4. Take a screenshot of the page with the search results**

In [24]:
driver.save_screenshot("LiveCoding.png")

True

**5. Select the first result and extract the href attribute**

In [25]:
print(driver.find_element(
    'css selector', 
    '.posting-title'
).get_attribute('href'))

https://newyork.craigslist.org/stn/fuo/d/staten-island-side-table-solid-wood/7761897383.html


## III. File I/O and Python Review

**Read lines from file**

In [26]:
with open('items.csv') as f:
    for line in f.readlines():
        print(line)

coffee table,furniture,5,15

triumph,motorcycles,800,20000
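
The blank line after each row in the output above appears because every line read from the file still ends with `"\n"`, and `print` adds a second newline. The sketch below shows one way to drop it with `rstrip("\n")`. It writes a temporary stand-in for `items.csv` (an assumption made so the example runs on its own) containing the two rows shown above.

```python
import os
import tempfile

# Write a small stand-in for items.csv into a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "items.csv")
with open(path, "w") as f:
    f.write("coffee table,furniture,5,15\n")
    f.write("triumph,motorcycles,800,20000\n")

with open(path) as f:
    raw_lines = f.readlines()

# Each raw line keeps its trailing newline; rstrip("\n") removes it.
clean_lines = [line.rstrip("\n") for line in raw_lines]

print(repr(raw_lines[0]))  # 'coffee table,furniture,5,15\n'
for line in clean_lines:
    print(line)
```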



**Splitting a string:**

In [27]:
"gucci belt,clothes+acc,2,20".split(",")

['gucci belt', 'clothes+acc', '2', '20']

**Write to a file**

In [28]:
with open('results.csv', 'a') as f:
    f.write("test,test,1,2\n")
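
The `'a'` mode used above appends to the file, so re-running the cell adds another row each time, whereas `'w'` would start the file over. The following sketch contrasts the two modes. It writes to a temporary file rather than the real `results.csv`, and the `fresh,row,3,4` row is made up for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "results.csv")

# "a" appends: writing twice leaves two rows in the file.
for _ in range(2):
    with open(path, "a") as f:
        f.write("test,test,1,2\n")
with open(path) as f:
    appended = f.read()

# "w" truncates the file first: only the newest row survives.
with open(path, "w") as f:
    f.write("fresh,row,3,4\n")
with open(path) as f:
    overwritten = f.read()

print(appended.count("\n"), overwritten.count("\n"))  # 2 1
```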

## Exercises

---

**1. Read in the file** `items.csv` **and use** `.split(",")` **on each line within the loop**

In [29]:
with open('items.csv') as f:
    # For-loop over f.readlines(). Remember to indent!
    for line in f.readlines():   
        #Interpret the comma separated values into a list by 
        #using `.split()`
        line_list = line.split(',')

**2. Save each item in the resulting list as** `description`, `category`, `minimum`, `maximum`

In [30]:
with open('items.csv') as f:
    # For-loop over f.readlines(). Remember to indent!
    for line in f.readlines():
        
        # Each line needs to be split()
        line_list = line.split(',')
        description, category, minimum, maximum = line_list
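
Python's built-in `csv` module is another way to do the same split and unpacking. It removes the line ending for you, so the last field does not carry a trailing `"\n"`. The sketch below is one possible variation, not the exercise's required solution. It reads from an in-memory string that stands in for `items.csv`, using the two rows printed earlier, and also converts the prices to integers.

```python
import csv
import io

# Stand-in for the contents of items.csv (the two rows printed earlier).
sample = "coffee table,furniture,5,15\ntriumph,motorcycles,800,20000\n"

searches = []
for row in csv.reader(io.StringIO(sample)):
    if not row:
        continue  # skip blank lines
    description, category, minimum, maximum = row
    searches.append((description, category, int(minimum), int(maximum)))

print(searches)
```

With a real file, `io.StringIO(sample)` would be replaced by the file object from `open('items.csv', newline='')`. If the prices are kept as integers, converting them back with `str()` before typing them into a form field is the safe choice.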

**3. Copy/Paste your code from previous exercises into the loop so that your program performs a search based on each line in the file.**  
Print the URLs of all the results

In [31]:
results = []
with open('items.csv') as f:
    # For-loop over f.readlines(). Remember to indent!
    for line in f.readlines():
        
        # Each line needs to be split()
        line_list = line.split(',')
        print(line_list)
        description, category, minimum, maximum = line_list

        driver.get("https://newyork.craigslist.org")
        driver.find_element('link text', category).click()
        
        min_price = driver.find_element('css selector', '[placeholder="min"]')
        max_price = driver.find_element('css selector', '[placeholder="max"]')
        min_price.send_keys(minimum)
        max_price.send_keys(maximum)       
        
        search = driver.find_element('css selector', '[enterkeyhint="search"]')
        search.send_keys(description)
        driver.find_element('css selector', '.cl-exec-search').click()
        
        for link in driver.find_elements('css selector', ".posting-title"):
            results.append(link.get_attribute("href"))
display(results)

['coffee table', 'furniture', '5', '15\n']
['triumph', 'motorcycles', '800', '20000\n']


['https://newyork.craigslist.org/que/fuo/d/little-neck-glass-octangular-coffee/7757661026.html',
 'https://newyork.craigslist.org/que/fuo/d/little-neck-coffee-table-antique-stone/7757662174.html',
 'https://newyork.craigslist.org/brk/fuo/d/brooklyn-wooden-coffee-table/7761987700.html',
 'https://newyork.craigslist.org/brx/fud/d/bronx-restoration-hardware-thaddeus/7761983548.html',
 'https://newyork.craigslist.org/mnh/fuo/d/new-york-vtg-80s-memphis-style-designer/7753226277.html',
 'https://newyork.craigslist.org/mnh/fuo/d/new-york-vintage-cute-small-handmade/7753426016.html',
 'https://newyork.craigslist.org/brk/fuo/d/brooklyn-modern-designer-coffee-table/7761977636.html',
 'https://newyork.craigslist.org/brk/fud/d/brooklyn-arhaus-square-coffee-table-was/7761977643.html',
 'https://newyork.craigslist.org/mnh/fuo/d/new-york-rattan-bamboo-end-table-glass/7755326246.html',
 'https://newyork.craigslist.org/que/fuo/d/flushing-rustic-coffee-table/7756286521.html',
 'https://newyork.craigslis

## IV. Sending e-mail
*Note: You must configure your Gmail account to allow Python to use it.* 
- *If you don't have 2FA configured, you'll need to [enable "less secure apps"](https://myaccount.google.com/lesssecureapps).*
- *If you do have 2FA configured, you'll need to [make an app password for `yagmail`](https://myaccount.google.com/apppasswords).*

**Import yagmail package:**

In [32]:
import yagmail
import getpass

**Sender username and password**:  

In [33]:
user = getpass.getpass('User name: ')
passw = getpass.getpass("Password: ")

User name:  ········
Password:  ········


In [34]:
yag = yagmail.SMTP(user, passw)

**Send message:** `yag.send(your_address_goes_here, 'this is my subject', "this is my message")`

Note: If you want to text your phone instead, you can use the table below to find the e-mail address to text.
<table class="styled" style="width: 547.365px;" border="0" align="center">
<tbody>
<tr>
<td style="text-align: center; width: 145px;"><strong>Carrier</strong></td>
<td style="text-align: center; width: 275px;"><strong>SMS gateway domain</strong></td>
<td style="text-align: center; width: 275px;"><strong>MMS gateway domain</strong></td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>Alltel</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@message.alltel.com</td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@mms.alltelwireless.com</td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>AT&amp;T</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@txt.att.net</td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@mms.att.net</td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>Boost Mobile</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@myboostmobile.com</td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@myboostmobile.com</td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>Cricket Wireless</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@sms.cricketwireless.net</td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@mms.cricketwireless.net</td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>Project Fi</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@msg.fi.google.com</td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@msg.fi.google.com</td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>Sprint</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@messaging.sprintpcs.com</td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@pm.sprint.com</td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>T-Mobile</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@tmomail.net</td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@tmomail.net</td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>U.S. Cellular</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@email.uscc.net</td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@mms.uscc.net</td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>Verizon</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@vtext.com</td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@vzwpix.com</td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>Virgin Mobile</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@vmobl.com</td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@vmpix.com</td>
</tr>
<tr>
<td style="text-align: center; width: 145px;"><strong>Republic Wireless</strong></td>
<td style="text-align: center; width: 275px;">[insert 10-digit number]@text.republicwireless.com</td>
<td style="text-align: center; width: 275px;"></td>
</tr>
</tbody>
</table>
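
Building the gateway address from the table can be sketched as a small lookup. Everything named here is hypothetical: `GATEWAYS` holds only three rows copied from the table above, and `gateway_address` is an illustrative helper, not part of `yagmail`. The phone number is a placeholder.

```python
import re

# A few rows copied from the table above: carrier -> (SMS domain, MMS domain).
GATEWAYS = {
    "AT&T": ("txt.att.net", "mms.att.net"),
    "T-Mobile": ("tmomail.net", "tmomail.net"),
    "Verizon": ("vtext.com", "vzwpix.com"),
}


def gateway_address(number, carrier, mms=False):
    """Return the e-mail address that forwards to `number` as a text message."""
    digits = re.sub(r"\D", "", number)  # keep only the digits
    if len(digits) != 10:
        raise ValueError("expected a 10-digit phone number")
    sms_domain, mms_domain = GATEWAYS[carrier]
    return f"{digits}@{mms_domain if mms else sms_domain}"


print(gateway_address("(212) 555-0100", "Verizon", mms=True))  # 2125550100@vzwpix.com
```

The returned address can be used as the first argument of `yag.send(...)`.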

In [35]:
tgt_email = getpass.getpass('Full target email address: ')

Full target email address:  ········


In [36]:
yag.send(tgt_email, 'Test', "Testing 123")

{}

## Exercises

---

**1. Use yagmail to send matching links to your e-mail address, instead of writing to file.**

In [37]:
yag.send(tgt_email, 'Automation Results', results)

{}

**2. Use the table above to find your phone carrier, and send the matching links from Exercise III.3 to your phone.**

In [38]:
phone = getpass.getpass('Phone number (only digits): ')
domain = 'vzwpix.com'
yag.send(phone+'@'+domain, 'Check out this stuff on Craigslist', results)

Phone number (only digits):  ········


{}

## With Improvements

In [39]:
results=[]

with open('items.csv') as f:
    # For-loop over f.readlines(). Remember to indent!
    for line in f.readlines():
        
        # Strip the trailing newline, then split the line on commas
        line_list = line.strip().split(',')
        description, category, minimum, maximum = line_list

        driver.get("https://newyork.craigslist.org")
        driver.find_element('link text', category).click()

        min_price = driver.find_element('css selector', '[placeholder="min"]')
        max_price = driver.find_element('css selector', '[placeholder="max"]')
        min_price.send_keys(minimum)
        max_price.send_keys(maximum)       
        
        distance = driver.find_element('css selector', '[placeholder="miles"]')
        postal = driver.find_element('css selector', '[name="postal"]')
        distance.send_keys("5")
        postal.send_keys("11238")
        
        
        search = driver.find_element('css selector', '[enterkeyhint="search"]')
        search.send_keys(description)
        driver.find_element('css selector', '.cl-exec-search').click()
        
        scam_words = ['rare', 'antique', 'pristine', 'sale!!']
        
        for link in driver.find_elements('css selector', ".posting-title")[:5]:
            scam = False
            for word in scam_words:
                if word in link.text.lower():
                    scam = True
                    print("SCAM!  " + link.text)
            if not scam:
                results.append(link.get_attribute("href"))
print('\n\n'.join(results))

https://newyork.craigslist.org/mnh/fuo/d/new-york-side-table-coffee-table/7760998897.html

https://newyork.craigslist.org/mnh/fuo/d/new-york-wood-brass-coffee-table-tiers/7760053555.html

https://newyork.craigslist.org/brk/fuo/d/brooklyn-compact-circular-coffee-table/7745738381.html

https://newyork.craigslist.org/mnh/fuo/d/new-york-table-lampschairstablemetal/7761850556.html

https://newyork.craigslist.org/brk/fuo/d/brooklyn-obo-new-couch-table-bookshelf/7757360205.html

https://newyork.craigslist.org/brk/mcy/d/brooklyn-2013-triumph-bonneville-865cc/7761833878.html

https://newyork.craigslist.org/mnh/mcd/d/new-york-2013-triumph-thunderbird/7761793332.html

https://newyork.craigslist.org/mnh/mcd/d/new-york-2023-triumph-tiger-900-gt-pro/7761506471.html

https://newyork.craigslist.org/mnh/mcd/d/new-york-2022-triumph-street-triple-rs/7761506338.html

https://newyork.craigslist.org/mnh/mcd/d/new-york-2023-triumph-tiger-1200-gt-pro/7761506299.html
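
The scam-word check in the loop above can also be written as a standalone function using `any()`, which makes it possible to try it without opening a browser. This is a sketch under assumptions: `looks_like_scam` is a hypothetical name and the titles are made up for illustration. Like the loop above, it does plain substring matching, so `rare` would also match a title containing `rarely`.

```python
SCAM_WORDS = ["rare", "antique", "pristine", "sale!!"]


def looks_like_scam(title, scam_words=SCAM_WORDS):
    """Return True if any scam word appears in the title, ignoring case."""
    lowered = title.lower()
    return any(word in lowered for word in scam_words)


# Made-up titles for illustration only.
titles = ["Wooden Coffee Table", "RARE antique side table", "Big SALE!! today"]
kept = [title for title in titles if not looks_like_scam(title)]
print(kept)  # ['Wooden Coffee Table']
```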
