# DDR Individual Project - Sujai Adithya Muralidharan
## Part 1: Scraping and Saving HTML Content

Importing the necessary libraries:

In [2]:
from bs4 import BeautifulSoup
import requests
import time
import os
import re

### 1. Identify the Target:
Start with navigating to the “free” section on the Craigslist San Francisco Bay Area site (https://sfbay.craigslist.org/search/zip).

This page lists items that people are giving away for free.

### 2. Interact with the Page-Sorting:

Initially, the listings might be sorted by “newest” first.  Try changing the sorting order to “oldest” first by interacting with the page’s UI.

Observe any changes in the URL after you change the sorting order back and forth.

#### Can you trigger the sorting change directly by modifying only the URL in your browser’s address bar?  If so, how?
Yes. Adding the query parameter 'sort=dateoldest' to the original link sorts the listings by oldest first: https://sfbay.craigslist.org/search/zip?sort=dateoldest#search=1~gallery~0~0

If we set 'sort=date', then we sort it by newest first: https://sfbay.craigslist.org/search/zip?sort=date#search=1~gallery~0~0

#### Explain what type of request is made when you change the sort order (GET or POST).
When we change the sort order in the URL, we are making a GET request: the browser asks the server for data selected by the URL parameters. It is not a POST request because nothing is submitted in a request body, such as login credentials or a filled-out form. We are only asking for the existing data in a different order by changing the URL.

#### What is the variable in the URL associated with sorting?
The variable associated with sorting in this case is 'sort'. When we set the sort variable to 'dateoldest' we sort it by oldest first, and when we set it to 'date' we sort it by newest first.
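As a minimal sketch of the answer above, the sort order can be set programmatically by encoding the 'sort' parameter into the query string. The helper name `build_sort_url` is hypothetical; the base URL and the two parameter values are the ones observed in the browser.

```python
from urllib.parse import urlencode

BASE_URL = "https://sfbay.craigslist.org/search/zip"


def build_sort_url(order):
    """Return the search URL with the 'sort' query parameter set.

    'order' is expected to be 'date' (newest first) or
    'dateoldest' (oldest first), the two values seen in the browser.
    """
    return f"{BASE_URL}?{urlencode({'sort': order})}"


print(build_sort_url("dateoldest"))
# prints https://sfbay.craigslist.org/search/zip?sort=dateoldest
```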

### 3. Interact with the Page-Pagination:
Craigslist paginates listings, typically displaying a limited number of items per page (120 for me).

#### Navigate to the second and third pages of results and observe the changes in the URL.

Page 1: https://sfbay.craigslist.org/search/zip#search=1~gallery~0~0

Page 2: https://sfbay.craigslist.org/search/zip#search=1~gallery~1~0

Page 3: https://sfbay.craigslist.org/search/zip#search=1~gallery~2~0

#### Exploration Task: Determine how to move between pages by only changing the URL.  What part of the URL changes as you navigate through different pages? This task will help you understand how pagination works on Craigslist and how you can programmatically access different pages of listings.

Changing the number after 'gallery~' to the desired page number minus 1 moves between pages using only the URL. We subtract 1 because the first page is indexed as 0.

#### Identify the variable associated with page changes.  How does altering this variable in the URL affect the page you’re viewing? Explain.

The variable associated with page changes here is the number after 'gallery~'. Altering this number in the URL selects the page we want to view. Since the first page is indexed as zero, we set it to one less than the desired page. For example, to see page 5, we set this number to 4. Note that everything after '#' is a URL fragment, which the browser does not send to the server in the request; the page number is therefore presumably read by the page's own script, which then loads the matching page of results.
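The zero-based page index can be sketched as a small URL builder. The function name `build_page_url` is hypothetical, and the fragment layout `#search=1~gallery~N~0` is copied from the three URLs observed above, not from any documented Craigslist interface.

```python
def build_page_url(page_number, base="https://sfbay.craigslist.org/search/zip"):
    """Build the gallery URL for a 1-based page number.

    The number after 'gallery~' is zero-based, so page 1 maps to 0.
    """
    if page_number < 1:
        raise ValueError("page_number starts at 1")
    return f"{base}#search=1~gallery~{page_number - 1}~0"


print(build_page_url(3))
# prints https://sfbay.craigslist.org/search/zip#search=1~gallery~2~0
```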

### 4. Fetch Listing URLs:
Use `requests` to access the first page of the “free” section, ordered “newest” first.

Deploy `BeautifulSoup` to parse the HTML content.

In [2]:
headers = {'User-Agent' : 'Mozilla/5.0'}

url = 'https://sfbay.craigslist.org/search/zip?sort=date#search=1~gallery~0~0'
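# headers must be passed by keyword; the second positional argument of requests.get is params
page = requests.get(url, headers=headers)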

soup = BeautifulSoup(page.content, 'html.parser')
print(soup.prettify())

<!DOCTYPE html>
<html>
 <head>
  <meta charset="utf-8"/>
  <meta content="IE=Edge" http-equiv="X-UA-Compatible"/>
  <meta content="width=device-width,initial-scale=1" name="viewport"/>
  <meta content="craigslist" property="og:site_name"/>
  <meta content="preview" name="twitter:card"/>
  <meta content="SF bay area free stuff - craigslist" property="og:title"/>
  <meta content="SF bay area free stuff - craigslist" name="description"/>
  <meta content="SF bay area free stuff - craigslist" property="og:description"/>
  <meta content="https://sfbay.craigslist.org/search/zip" property="og:url"/>
  <title>
   SF bay area free stuff - craigslist
  </title>
  <link href="https://sfbay.craigslist.org/search/zip" rel="canonical"/>
  <link href="https://sfbay.craigslist.org/search/zip" hreflang="x-default" rel="alternate"/>
  <link href="/favicon.ico" id="favicon" rel="icon">
   <script id="ld_searchpage_data" type="application/ld+json">
    {"breadcrumb":{"@context":"https://schema.org","@type"

#### Identify the structure that holds the links to individual listing pages.  What selector do you choose to grab the link?
The structure that holds the links to individual listing pages is the 'li' tag with the class 'cl-static-search-result'. To grab the links, we select the items with soup.select('li.cl-static-search-result') and, for each item, read item.find('a')['href'].

#### Can you identify one more possible selection method to retrieve the link to the individual listing?  Explain.
We have URLs in two parts of the HTML. Both are under 'li.cl-static-search-result'; one is an anchor located inside 'div.gallery-inner' and the other is an anchor located under 'div.gallery-card'. We can target either of them, which gives two ways to extract the links.

#### Extract the first 250 unique listing URLs and save them to a list.  Consider the pagination feature of Craigslist to navigate through pages.  Explain your strategy. Print the list to screen.

Here, we run a for loop over the first 250 listings, extract each link, and save it to a list. We do not need to handle pagination because the HTML returned to `requests` is not the same as what we see when we inspect the site in the browser: the soup object already contains more than 250 listings, so we can extract them directly. The result is printed below.

In [17]:
lists = []

links = soup.select('li.cl-static-search-result')

for l in links[:250]:
    link = l.find('a')['href']
    print(link)
    lists.append(link)

https://sfbay.craigslist.org/pen/zip/d/palo-alto-free-laundry-promotion/7715548665.html
https://sfbay.craigslist.org/sby/zip/d/saratoga-free-outdoor-standing/7715547249.html
https://sfbay.craigslist.org/sfc/zip/d/san-francisco-bookpattern-and-flow/7715547173.html
https://sfbay.craigslist.org/sby/zip/d/san-jose-chaise-lounge/7715546331.html
https://sfbay.craigslist.org/eby/zip/d/oakland-entertainment-unit-59x15x23/7715546023.html
https://sfbay.craigslist.org/nby/zip/d/santa-rosa-trampoline/7715545952.html
https://sfbay.craigslist.org/scz/zip/d/santa-cruz-free-speakers-and-klipsch-sub/7715545532.html
https://sfbay.craigslist.org/sfc/zip/d/san-francisco-free-glass-coffee-table/7715544591.html
https://sfbay.craigslist.org/nby/zip/d/healdsburg-free-crate-barrel-willow/7712753370.html
https://sfbay.craigslist.org/eby/zip/d/el-cerrito-books/7715542855.html
https://sfbay.craigslist.org/sfc/zip/d/san-francisco-build-your-own-birdhouse/7715542366.html
https://sfbay.craigslist.org/sfc/zip/d/san-f
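The task asks for unique URLs, and the slice `links[:250]` does not check for repeats. A minimal sketch of order-preserving deduplication is shown here; the helper name `first_unique` is hypothetical and the sample URLs are taken from the output above.

```python
def first_unique(urls, limit=250):
    """Return the first 'limit' distinct URLs, keeping their original order."""
    seen = set()
    unique = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            unique.append(url)
            if len(unique) == limit:
                break
    return unique


sample = [
    "https://sfbay.craigslist.org/eby/zip/d/el-cerrito-books/7715542855.html",
    "https://sfbay.craigslist.org/nby/zip/d/santa-rosa-trampoline/7715545952.html",
    "https://sfbay.craigslist.org/eby/zip/d/el-cerrito-books/7715542855.html",
]
print(len(first_unique(sample)))
# prints 2
```

Applied to the notebook, the input would be the list of extracted `href` values taken before slicing, so that duplicates do not reduce the final count below 250.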

### 5. Save HTML Pages:
For each of the 250 listing URLs, use `requests` to fetch the listing page.

Save each HTML content to a separate file on disk.  Use each listing’s ID to organize files in a way that makes them easily identifiable (e.g., save listing ID 7713901653 to file “7713901653.html”).

In [26]:
x = os.getcwd()
print(x)

/Users/sujaiadithya/Desktop/DDR Project


Saving the page content of each listing URL to a local directory: /Users/sujaiadithya/Desktop/DDR Project/HTML_Links. A regex pattern extracts the listing ID, which becomes the .html file name. A 5-second pause is set between requests.

In [47]:
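folder = 'HTML_Links'
os.makedirs(folder, exist_ok=True)  # create the output folder if it does not exist yet
regex = r'/([0-9]+)\.html$'  # captures the numeric listing ID at the end of the URL

for url2 in lists:
    match = re.search(regex, url2)
    if match:
        ids = match.group(1)
        page2 = requests.get(url2)
        soup2 = BeautifulSoup(page2.content, 'html.parser')
        local_file = os.path.join(folder, f"{ids}.html")
        # the with block closes the file even if the write fails
        with open(local_file, 'w', encoding='utf-8') as f:
            f.write(soup2.prettify())
        time.sleep(5)  # pause between requests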
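To illustrate how the listing ID is pulled out of a URL, here is a small standalone sketch of the same regex idea. The helper name `listing_id` is hypothetical; the sample URL is one of the links printed earlier.

```python
import re

# Matches the run of digits between the last '/' and the '.html' suffix.
LISTING_ID_PATTERN = re.compile(r"/([0-9]+)\.html$")


def listing_id(url):
    """Return the numeric listing ID as a string, or None if the URL does not match."""
    match = LISTING_ID_PATTERN.search(url)
    return match.group(1) if match else None


print(listing_id("https://sfbay.craigslist.org/sby/zip/d/san-jose-chaise-lounge/7715546331.html"))
# prints 7715546331
```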

## Part 2: Parsing and Displaying Information from Saved HTML

### 1. Read Saved HTML Files:
Write a script that reads each of the saved HTML files from the disk.

### 2. Extract Information:
For each HTML file, use `BeautifulSoup` to parse the file content.

Extract and print the following details:

Title: The title of the listing.

URL of first image (if an image exists):  The URL of the displayed image.  It can be found in the `src` attribute of `<img>`

Description: The full description text of the listing.

Post ID: Usually found at the bottom of the page or within the page's HTML structure.

Posted Date: The date when the listing was originally posted.

Last Updated Date: The date when the listing was last updated.


Defining a function to read the saved HTML files and extract the information listed in the question above. For each piece of information, we use an if statement to check whether the corresponding element exists and print the information if it does. In some cases, we use .decompose() to remove unneeded tags nested under a parent element.

In [149]:
def details(directory):
    for filename in os.listdir(directory):
        if filename.endswith(".html"):
            filepath = os.path.join(directory, filename)

            print("Listing ID:", os.path.splitext(filename)[0])

            with open(filepath, 'r', encoding='utf-8') as file:
                html = file.read()

            soup5 = BeautifulSoup(html, 'html.parser')

            print("")
            title = soup5.select_one('span.postingtitletext')

            if title:
                title_full = title.find_all('span')
                full_text = ""

                for i in title_full:
                    text = i.text.strip()
                    full_text = full_text + text + " "

                print("Title:", full_text.strip())
            else:
                print("No title")

            print("")
            image = soup5.find('img')

            if image:
                img_url = image.get('src')
                print("URL of image:", img_url)
            else:
                print("No image")

            print("")
            desc = soup5.select_one("section#postingbody")

            if desc:
                qr_code = desc.select_one("p.print-qrcode-label")
                if qr_code:
                    qr_code.decompose()
                description = desc.text.strip()
                print("Description:", description)
            else:
                print("No description")

            print("")
            post = soup5.select_one("div.postinginfos")

            if post:
                bs = post.select_one("a.bestof-link")
                if bs:
                    bs.decompose()
                bs1 = post.select_one("sup")
                if bs1:
                    bs1.decompose()
                print(post.text.strip())
            else:
                print("Post ID:", os.path.splitext(filename)[0])

            print("")
            print("")
            print("")

directory = 'HTML_Links'
details(directory)

Listing ID: 7715526266

Title: 3 drawer lateral file cabinet (san anselmo)

URL of image: https://images.craigslist.org/00Q0Q_5aDc88w6ZRt_0t20CI_600x450.jpg

Description: Giving away a lateral file cabinet in excellent condition. I dont have the keys, so it wont lock. other than that it works great. I am having a bunch of stuff hauled away on Saturday morning, so you need to pick it up before the haulers get there. Please dont ask if it is available, and don't tell me you want it if you are not going to show up.

post id: 7715526266
      

       posted:
       
        2024-02-07 16:39



Listing ID: 7714731668

Title: FREE TEDDY BEARS & GALAXY ROSES FOR VALENTINE'S DAY TO SPREAD LOVE (ingleside / SFSU / CCSF)

URL of image: https://images.craigslist.org/00Y0Y_8bQG1tUwXSq_0cI0oc_600x450.jpg

Description: FREE TEDDY BEARS, VALENTINE PLUSH TOYS & GALAXY ROSES FOR VALENTINE'S DAY TO SPREAD LOVE on GARAGE SALE FRIDAY FEB 9, SATURDAY FEB 10 AT AND SUNDAY FEB 11 AT 249 SAGAMORE STREET IN S


Title: Nursing school CD’s (Santa Cruz)

URL of image: https://images.craigslist.org/00g0g_xqpqng2aoy_0lM0CI_600x450.jpg

Description: These CD’s are from 2005-2007. They came with my skill books and textbooks. I don’t know if anyone uses CD’s anymore?

post id: 7705769410
      

       posted:
       
        2024-01-09 08:39
       


       updated:
       
        2024-02-07 17:05



Listing ID: 7715466412

Title: Wooden cabinet (inner sunset / UCSF)

URL of image: https://images.craigslist.org/00M0M_5elXOFuHGL6_0lM0t2_600x450.jpg

Description: Cabinet was used for T V and stereo.  Going out for recology pick up.  If you want it, please come take it. Sorry, no help or delivery, we are clearing out.
      
      Also putting out nice large leather chair and foot ottoman.
      
      It is also free for the taking.
      
      Recology coming tomorrow (thurs) so come today!
      
      1931 10th Avenue

post id: 7715466412
      

       posted:
       
        2024-02-07 13:30


URL of image: https://images.craigslist.org/00606_luEgAKNl0eu_0CI0ik_600x450.jpg

Description: 2 of this.
      
      Text me
      
      415 756 036nine.

post id: 7714717921
      

       posted:
       
        2024-02-05 10:48
       


       updated:
       
        2024-02-07 12:47



Listing ID: 7705495184

Title: Couch with pull out bed (Hayward)

URL of image: https://images.craigslist.org/00q0q_7b1LhAHpiYy_0CI0CI_600x450.jpg

Description: If ad is up it is still available.
      

      Very nice couch, but vinyl is flaking is corners.
      

      Non-smoker
      
      No pets
      

      Will remove post once gone

post id: 7705495184
      

       posted:
       
        2024-01-08 11:33
       


       updated:
       
        2024-02-07 10:38



Listing ID: 7710582593

Title: Free counter top (healdsburg / windsor)

URL of image: https://images.craigslist.org/00N0N_8AOqUYel9CC_0CI0t2_600x450.jpg

Description: Free counter top for bathroom, 6 feet by 22”. Call 

Title: Free sandbags (San Francisco)

URL of image: https://images.craigslist.org/00606_5XOCWSCzjek_0t20CI_600x450.jpg

Description: Free sandbags. Got for the storms, turns out we already had some.  I can leave outside if you want to come grab them.

post id: 7715459120
      

       posted:
       
        2024-02-07 13:12



Listing ID: 7715449570

Title: FREE desk chair with arms (walnut creek)

URL of image: https://images.craigslist.org/00X0X_aD4Kus0rgBJ_0lM0t2_600x450.jpg

Description: FREE  Captain's chair with arms, comfortable cushions upholstered in tan micro-fiber fabric. Adjustable seat and back, on casters. No animals in residence. 42"H, 27"W, 20"D. Normal wear, no tears or stains. Plan to move chair down two flights of stairs, I'm injured and can't carry anything. FREE

post id: 7715449570
      

       posted:
       
        2024-02-07 12:49



Listing ID: 7711928339

Title: SOFAS! Beige Sofa for free - 3 pieces (menlo park)

URL of image: https://images.craigslist.o


Title: Moving ***Free*** Household Items & Furniture (San Francisco)

URL of image: https://images.craigslist.org/00s0s_3bOVBOobVKs_0lM0CI_600x450.jpg

Description: 150 Valencia (Between Market & Duboce)
      

      If posting is still up item is here.
      

      No time unfortunately to respond to any inquiries.    First posted items are a lamp and MCM  hilt in Headboard.
      

      More to come ….

post id: 7715525602
      

       posted:
       
        2024-02-07 16:37
       


       updated:
       
        2024-02-07 16:55



Listing ID: 7715363491

Title: Floor plastic water proof sheet (Oakland)

URL of image: https://images.craigslist.org/00u0u_4ra2903ypgQ_0t20CI_600x450.jpg

Description: Probably almost a whole sheet. Only did our camper truck

post id: 7715363491
      

       posted:
       
        2024-02-07 09:31



Listing ID: 7715403901

Title: Free latch hook rug yarn (Danville)

URL of image: https://images.craigslist.org/00z0z_ck7wVKIbAN3_0CI0t2_600x45

URL of image: https://images.craigslist.org/00r0r_981rQ2BDVyY_0lM0t2_600x450.jpg

Description: Smart cat touchpad
      
      16th Ave between Geary and Clement

post id: 7714689535
      

       posted:
       
        2024-02-05 09:48
       


       updated:
       
        2024-02-07 15:34



Listing ID: 7715535360

Title: IKEA MYDAL bunk bed (palo alto)

No image

Description: I have a disassembled Mydal bunk bed from IKEA. This was in use from 2015-2024 and shows signs of use such as discoloration of some of  the wood. It is made of solid wood, so would be very easy to sand and make as good as new again.
      

      Bunk bed is currently partially disassembled and would be easy to transport in a truck. If you need to transport it in a smaller vehicle such as an SUV, I would be happy to further disassemble it.

post id: 7715535360
      

       posted:
       
        2024-02-07 17:13
       


       updated:
       
        2024-02-07 17:14



Listing ID: 7715527131

Title

Description: Clean just washed queen size comforter and 2 pillow cases. Free for the taking

post id: 7715505302
      

       posted:
       
        2024-02-07 15:27



Listing ID: 7715411920

No title

No image

No description

Post ID: 7715411920



Listing ID: 7715438964

Title: FREE Bucket Buddy (Tool holder for 5 gal bucket) (menlo park)

URL of image: https://images.craigslist.org/01111_fJdUYtIaBg1_07K0ak_600x450.jpg

Description: FREE 5 gallon bucket fitted with a multi-pocketed tool holder,  The canvas material has a few tears where it folds over the top of the bucket, but it is fully functional.

post id: 7715438964
      

       posted:
       
        2024-02-07 12:21



Listing ID: 7709864918

Title: Free Interior Wood doors (coastside/pescadero)

URL of image: https://images.craigslist.org/00Y0Y_lNLNfPa1xXp_0fu0kE_600x450.jpg

Description: Free Interior Wood doors
      

      Slab style door, with hardware and casings, ready to mount.   Pulled them out of a remodel.



Title: Free outdoor table and chairs (Petaluma)

URL of image: https://images.craigslist.org/00V0V_j4MJrPluPZm_0CI0t2_600x450.jpg

Description: Free table. Seats need replaced but the chairs are solid and they rock. I’ll take down the add when they are gone.

post id: 7715524280
      

       posted:
       
        2024-02-07 16:32



Listing ID: 7715470479

Title: Large sheet of tempered glass (richmond / point / annex)

URL of image: https://images.craigslist.org/00Z0Z_8e16B1G9p18_0t20CI_600x450.jpg

Description: A sheet of 46" x 76" x 3/16" tempered glass. It would make a great windscreen or use it to build a green house. It has silicone sealer stuck to edges from its previous use in a sun room.

post id: 7715470479
      

       posted:
       
        2024-02-07 13:42



Listing ID: 7715360612

Title: BBQ coal and plastic container (Oakland)

URL of image: https://images.craigslist.org/00404_6QLfZB984TR_0t20CI_600x450.jpg

Description: Good dry need gone

post id: 7715360612
 

post id: 7715440534
      

       posted:
       
        2024-02-07 12:25



Listing ID: 7715444062

Title: FREE Lazy Boy sectional couch (santa cruz)

URL of image: https://images.craigslist.org/00v0v_5buiJN5y8YZ_0ak07K_600x450.jpg

Description: 10 foot by 8.5 foot.
      
      Paid about 4500 8 years ago.
      
      It has 4 power recliners that function correctly.
      
      It looks like leather but it's not real leather and the finish is wearing off,
      
      Shound be fine as is or with a cover ever better.
      
      Bring a truck or trailer.
      
      Free, first come gets it.
      
      Need gone asap ordered a new one.
      




      Bedroom, sofa, couch, sleep, recliner,

post id: 7715444062
      

       posted:
       
        2024-02-07 12:34
       


       updated:
       
        2024-02-07 13:26



Listing ID: 7715305986

Title: Raft - Rapid Rider (Capitola)

URL of image: https://images.craigslist.org/00202_d6C7XKr0MpW_0CI0t2_600x450.jpg

Descri
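The post ID, posted date, and updated date above are printed as one whitespace-heavy block of text. A minimal sketch of splitting that block into separate fields follows. The function name `parse_posting_info` is hypothetical, and the `YYYY-MM-DD HH:MM` date layout is an assumption based only on the outputs shown above.

```python
import re
from datetime import datetime


def parse_posting_info(text):
    """Split the 'postinginfos' text into post ID, posted date, and updated date.

    Fields that are missing from the text are returned as None.
    """
    info = {"post_id": None, "posted": None, "updated": None}

    id_match = re.search(r"post id:\s*(\d+)", text)
    if id_match:
        info["post_id"] = id_match.group(1)

    for key in ("posted", "updated"):
        date_match = re.search(
            rf"{key}:\s*(\d{{4}}-\d{{2}}-\d{{2}} \d{{2}}:\d{{2}})", text
        )
        if date_match:
            info[key] = datetime.strptime(date_match.group(1), "%Y-%m-%d %H:%M")
    return info


sample_text = """post id: 7705769410

       posted:

        2024-01-09 08:39

       updated:

        2024-02-07 17:05
"""
print(parse_posting_info(sample_text))
```

Listings that were never updated have no 'updated:' line, so that field stays None for them.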

## Part 3: Automating Login on The Old Reader

### 1. Creating and Verifying a The Old Reader Account
Account Creation:  Create an account on https://theoldreader.com.  Use an email address and password that you are comfortable sharing with us.

Manual Login Verification: Before automating the login process, ensure you can manually log in to theoldreader.com with your new credentials.  This confirms that your account is active and your credentials are correct.

--- Account has been successfully created and verified.

### 2. Exploring the Login Mechanism
Navigate to the login page of https://theoldreader.com.

Use your browser’s developer tools to inspect the page, focusing on the `form` tag involved in the login process.

#### Document all `input` fields within the login form, paying special attention to their name attributes. These fields are crucial for submitting the login request programmatically.

input name="authenticity_token"

input name="user[login]"

input name="user[password]"

input name="commit"


### 3. Analyzing Network Traffic for Login Request
With the network tab of your browser’s developer tools open, log in to the site again.

#### Identify the network request made when you submit the login form (GET or POST). Explain why this method was chosen.
This is a POST request because we are sending data to the server: the credentials needed to log in. POST was chosen because the credentials are submitted in the request body rather than placed in the URL. If we only needed to retrieve existing data without submitting anything, it would be a GET request.
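The difference can be sketched without touching the network. The example below uses the standard library's `urllib.request.Request` (not `requests`, which the notebook uses) only to build two request objects and report their methods; nothing is sent. The e-mail address is a placeholder, not a real credential.

```python
from urllib.parse import urlencode
from urllib.request import Request

# A request with no body: the parameters travel in the URL, so the method is GET.
get_request = Request("https://sfbay.craigslist.org/search/zip?sort=date")

# A request with a form-encoded body: urllib treats it as POST.
form_body = urlencode({"user[login]": "someone@example.com"}).encode("utf-8")
post_request = Request("https://theoldreader.com/users/sign_in", data=form_body)

print(get_request.get_method())   # prints GET
print(post_request.get_method())  # prints POST
```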

In [3]:
headers = {'User-Agent' : 'Mozilla/5.0'}

url = 'https://theoldreader.com/users/sign_in'
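# headers must be passed by keyword; the second positional argument of requests.get is params
page = requests.get(url, headers=headers)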

soup = BeautifulSoup(page.content, 'html.parser')
print(soup.prettify())

<!DOCTYPE html>
<html>
 <head>
  <meta charset="utf-8"/>
  <meta content="IE=edge" http-equiv="X-UA-Compatible"/>
  <meta content="width=device-width, initial-scale=1" name="viewport"/>
  <!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags -->
  <link href="https://fonts.googleapis.com/css?family=Montserrat:400,600" rel="stylesheet" type="text/css"/>
  <!-- Latest compiled and minified JavaScript -->
  <script src="https://code.jquery.com/jquery.js">
  </script>
  <script src="//netdna.bootstrapcdn.com/bootstrap/3.0.1/js/bootstrap.min.js">
  </script>
  <link href="https://fonts.googleapis.com/css?family=Source+Code+Pro" rel="stylesheet" type="text/css"/>
  <link href="https://fonts.googleapis.com/css?family=Open+Sans:400,800" rel="stylesheet"/>
  <link href="//s.theoldreader.com/assets/reader/public-c7869a909c7b119a27fb646003828344.css" media="screen" rel="stylesheet" type="text/css">
   <link href="//s.theoldreader.com/assets/

In [4]:
form = soup.select_one('#new_user')

auth_input = form.select_one('input[name=authenticity_token]')
auth = auth_input.get('value')

print(auth)

6JL+UI7logWKvhs54wc9u3GTeGIC1fhhqjz0FYpO7ak=


#### Carefully examine the payload that was submitted to the server during login. Compare this payload to the `form` / `input` fields you previously analyzed. Explain your observation.
We can see that the payload contains all the fields that we identified in the input tags. We copy the payload values for 'user[login]', 'user[password]' and 'commit' into the data argument of session.post().

For the authenticity token, since it may change between page loads, we pass the 'auth' variable extracted in the previous cell instead of a hard-coded value.
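As a sketch of how the payload mirrors the form, the four input names can be assembled into a dictionary and form-encoded. The helper name `build_login_payload` and all three argument values are placeholders chosen for illustration; the field names are the ones documented from the login form.

```python
from urllib.parse import urlencode


def build_login_payload(token, login, password):
    """Assemble the login form fields using the input names found in the form."""
    return {
        "authenticity_token": token,
        "user[login]": login,
        "user[password]": password,
        "commit": "Sign In",
    }


# Placeholder values; in the notebook the token comes from the 'auth' variable.
payload = build_login_payload("TOKEN_FROM_PAGE", "user@example.com", "example-password")
print(urlencode(payload))
```

The encoded string shows what a form submission carries in its body: square brackets and the '@' sign are percent-encoded, and every form field appears as a key=value pair.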

### 4. Automating the Login Process
Using Python and appropriate libraries like requests, simulate the login process.

Create a session object to maintain your login state across multiple requests.

Prepare a payload with your login credentials and other necessary form data identified from the login page and the network analysis.

Send a POST request to the login form’s action URL to log in, using the session object.

In [5]:
time.sleep(5)

session = requests.session()

session.post("https://theoldreader.com/users/sign_in", data = {'authenticity_token' : auth,
                                                              'user[login]' : 'sujaidt98@gmail.com',
                                                              'user[password]' : 'thesecret',
                                                               'commit' : 'Sign In'},
                                                              timeout=20)
cookies = session.cookies.get_dict()
print(cookies)

{'_new_reader_session': 'BAh7CkkiD3Nlc3Npb25faWQGOgZFVEkiJWZiNmVjYTc5YjA2MGEzMmE0NTA5ZDgyNmIzMmRiNWQwBjsAVEkiGXdhcmRlbi51c2VyLnVzZXIua2V5BjsAVFsHWwZVOhpNb3BlZDo6QlNPTjo6T2JqZWN0SWQiEWVlyBCW7exUkzjCDkkiIiQyYSQwNSQ3Rmw1eFkuWDE5bHdiTXBYR3l6eDdPBjsAVEkiDWxhbmd1YWdlBjsARjoHZW5JIhByZWRpcmVjdF90bwY7AEZJIgYvBjsARkkiEF9jc3JmX3Rva2VuBjsARkkiMWtYTGd1OUJ4cXRGNTZDL0gzVWVHS3NyVnd3UGg0M0xIczVpM2lKOHA5bkk9BjsARg%3D%3D--3b3b523387b5a77bad630f5d16d7a6e32630572b', 'i_know_you': 'Sujai+Adithya', 'remember_user_token': 'BAhbB1sGVToaTW9wZWQ6OkJTT046Ok9iamVjdElkIhFlZcgQlu3sVJM4wg5JIiIkMmEkMDUkN0ZsNXhZLlgxOWx3Yk1wWEd5eng3TwY6BkVU--febd6901858f7066c59d3947cdf18878a87370c7', 'signed_at': '1707414622'}


### 5. Verifying Successful Login
After attempting to log in, inspect the cookies saved in the session object to understand the information The Old Reader stores on your computer.

Use the session object to access https://theoldreader.com.

Verify successful login by checking for the presence of your user information that is only available when logged in.

In [9]:
time.sleep(5)

url2 = 'https://theoldreader.com'
page2 = session.get(url2, cookies=cookies)

soup2 = BeautifulSoup(page2.content, 'html.parser')
print(soup2.prettify())

<!DOCTYPE html>
<html>
 <head>
  <meta charset="utf-8"/>
  <meta content="IE=edge" http-equiv="X-UA-Compatible"/>
  <meta content="width=device-width, initial-scale=1" name="viewport"/>
  <!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags -->
  <link href="//s.theoldreader.com/assets/application-befb06d5a14978388154b51422cef437.css" media="all" rel="stylesheet" type="text/css"/>
  <link href="//s.theoldreader.com/assets/apple-touch-icon-57x57-86fe1176e14af4907a6fecfe5ca7e3f1.png" rel="apple-touch-icon-precomposed" sizes="57x57"/>
  <link href="//s.theoldreader.com/assets/apple-touch-icon-114x114-bae89acc41c93261dd962ea6ade08d22.png" rel="apple-touch-icon-precomposed" sizes="114x114"/>
  <link href="//s.theoldreader.com/assets/apple-touch-icon-72x72-f248503edfa3676f8d58af531aff7e88.png" rel="apple-touch-icon-precomposed" sizes="72x72"/>
  <link href="//s.theoldreader.com/assets/apple-touch-icon-144x144-510415291cae9b46a9ca4ac398

In [10]:
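verify = soup2.select_one('li.dropdown')
# select_one returns None when no matching element exists, so guard before reading .text
print(verify.text.strip() if verify else "No 'li.dropdown' element found")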

Sujai Adithya  

Settings
Manage Settings
Manage Account
Manage Subscriptions
View Profile

Help
Product Tour
Support

Sign Out


Login has been successfully verified.