### Purpose:
The purpose of this notebook is to reacquire the text content used in the Correlation Study, using the URLs we already have.

### Environment:
This notebook uses the `soup` environment because it needs Beautiful Soup and Pandas to run.

### Dependencies:

In [96]:
import pandas as pd
import requests
from bs4 import BeautifulSoup
import re
import os
import csv

### Functions:

In [101]:
def get_url_response(url):
    """Fetch url and return the Response, or None if the request fails."""
    # Send a browser-like User-Agent with every request
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'}

    try:
        response = requests.get(url, headers=headers, timeout=5)
    except requests.exceptions.Timeout as e:
        # Handle connect and read timeouts
        response = None
        print(f"Timeout occurred while requesting {url}: {e}")
    except requests.exceptions.RequestException as e:
        # Handle all other request errors (SSL, connection, invalid URL, ...)
        response = None
        print(f"Request error occurred while requesting {url}: {e}")
    return response
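
Because `get_url_response` prints one line per failure in a fixed format, the failed URLs can be recovered from the captured output afterwards. The following is a minimal sketch of that idea, not part of the original notebook: `FAILURE_PATTERN` and `collect_failed_urls` are hypothetical names, and the sketch assumes the log is available as a list of text lines.

```python
import re

# Matches the two message prefixes printed by get_url_response.
# The URL is taken to be the run of non-space characters before ": ".
FAILURE_PATTERN = re.compile(
    r"^(Timeout|Request error) occurred while requesting (\S+): "
)


def collect_failed_urls(log_lines):
    """Return (kind, url) pairs for every failure message in log_lines."""
    failures = []
    for line in log_lines:
        match = FAILURE_PATTERN.match(line)
        if match:
            failures.append((match.group(1), match.group(2)))
    return failures


sample_log = [
    "Requesting record 5...",
    "Timeout occurred while requesting https://www.hunker.com/13402242/how-to-finish-concrete-floors: "
    "HTTPSConnectionPool(host='www.hunker.com', port=443): Read timed out. (read timeout=5)",
    "Request error occurred while requesting https://www.hoteleffie.com/dining/ovide: "
    "HTTPSConnectionPool(host='www.hoteleffie.com', port=443): Read timed out.",
]
failed = collect_failed_urls(sample_log)
```

The resulting list could then be fed back into a second pass, for example with a longer timeout, if the missing pages matter for the study.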

In [103]:
def save_url_responses(links, directory):
    # Create the output directory if it doesn't exist yet
    os.makedirs(directory, exist_ok=True)

    # Iterate through each link in the list
    for i, link in enumerate(links):
        # Print current record number being requested
        print(f"Requesting record {i+1}...")

        try:
            # Get the response for the link (falsy if the request failed)
            response = get_url_response(link)

            # A Response is also falsy for 4xx/5xx status codes, so error pages are skipped
            if response:
                # Parse HTML and find all text content
                soup = BeautifulSoup(response.content, 'html.parser')
                text = soup.get_text()

                # Save the text content to a CSV file in the specified directory
                filename = f"{i+1:05d}.csv"
                filepath = os.path.join(directory, filename)
                # newline="" lets the csv module control line endings itself
                with open(filepath, "w", newline="", encoding="utf-8") as f:
                    writer = csv.writer(f)
                    writer.writerow(["Link", "Text"])
                    writer.writerow([link, text])

        # get_url_response already catches request errors; these handlers are a safety net
        except requests.exceptions.SSLError as e:
            # Handle SSL error
            print(f"SSL error occurred while requesting {link}: {e}")

        except requests.exceptions.RequestException as e:
            # Handle other request errors
            print(f"Request error occurred while requesting {link}: {e}")
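
Each file written by `save_url_responses` holds a `Link`/`Text` header row followed by a single data row. A minimal sketch of reading one back is shown here; `read_saved_response` is a hypothetical helper rather than part of the notebook, and the raised field size limit is an assumption made because a page's text can exceed the `csv` module's default limit of 131072 characters per field.

```python
import csv
import os
import tempfile

# Assumed safeguard: page text may be longer than the default field limit.
csv.field_size_limit(10_000_000)


def read_saved_response(filepath):
    """Return the (link, text) pair stored in one saved CSV file."""
    with open(filepath, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader)  # skip the ["Link", "Text"] header row
        link, text = next(reader)
    return link, text


# Round trip with a throwaway file that mimics the saved layout.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "00001.csv")
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Link", "Text"])
        writer.writerow(["https://example.com/", "line one\nline two"])
    link, text = read_saved_response(path)
```

Opening the file with `newline=""` on both the write and the read side keeps newlines embedded in the page text intact.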

In [None]:
def clean_newlines(text):
    # Replace all newlines with a space
    text = re.sub("\n", " ", text)

    # Replace all double spaces with a single space
    text = re.sub("  +", " ", text)

    return text
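
`clean_newlines` handles `\n` and runs of spaces, but tabs, carriage returns, and non-breaking spaces pass through unchanged. If those turn up in the scraped text, a broader variant along the following lines may help. This is a sketch under that assumption, and `normalize_whitespace` is a hypothetical name, not a function from the notebook.

```python
import re


def normalize_whitespace(text):
    """Collapse every run of whitespace into one space and trim the ends."""
    # For str patterns, \s also matches Unicode whitespace such as U+00A0.
    return re.sub(r"\s+", " ", text).strip()


cleaned = normalize_whitespace("a\r\n\tb\u00a0 c\n")
```

Unlike `clean_newlines`, this variant also strips leading and trailing whitespace, which may or may not be wanted depending on how the text is used later.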

### Execution Code:

In [104]:
api_key = os.environ.get('X_OAI_API_KEY')

In [105]:
df = pd.read_csv('data/data_clean_final.csv')

In [106]:
df[df['link']=='https://www.familyhandyman.com/project/how-to-finish-concrete/']

Unnamed: 0,kw,rank,link,success,word_count,percent_human,percent_ai,uid,Adwords bottom,Adwords sitelink,...,Local pack,Local teaser,People also ask,Shopping results,Sitelinks,Thumbnail,Top stories,Tweet box,Video preview,Videos
0,how to finish concrete,1,https://www.familyhandyman.com/project/how-to-...,True,1689.0,99.926081,0.073917,how to finish concrete_1_https://www.familyhan...,0,0,...,0,0,1,0,0,0,0,0,0,1


In [107]:
links = df['link'].to_list()
directory = "data/responses"
save_url_responses(links, directory)

Requesting record 1...
Requesting record 2...
Requesting record 3...
Requesting record 4...
Requesting record 5...
Timeout occurred while requesting https://www.hunker.com/13402242/how-to-finish-concrete-floors: HTTPSConnectionPool(host='www.hunker.com', port=443): Read timed out. (read timeout=5)
Requesting record 6...
Timeout occurred while requesting https://www.ehow.com/way_5571195_inexpensive-way-finish-concrete-floors.html: HTTPSConnectionPool(host='www.ehow.com', port=443): Read timed out. (read timeout=5)
Requesting record 7...
Requesting record 8...
Requesting record 9...
Requesting record 10...
Requesting record 11...
Requesting record 12...
Requesting record 13...
Requesting record 14...
Requesting record 15...
Requesting record 16...
Requesting record 17...
Requesting record 18...
Requesting record 19...
Requesting record 20...
Requesting record 21...
Requesting record 22...
Requesting record 23...
Requesting record 24...
Requesting record 25...
Requesting record 26...
Requ

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 829...
Requesting record 830...
Requesting record 831...
Requesting record 832...
Requesting record 833...
Requesting record 834...
Requesting record 835...
Requesting record 836...
Requesting record 837...
Requesting record 838...
Requesting record 839...
Requesting record 840...
Requesting record 841...
Requesting record 842...
Requesting record 843...
Requesting record 844...
Requesting record 845...
Requesting record 846...
Requesting record 847...
Requesting record 848...
Requesting record 849...
Requesting record 850...
Requesting record 851...
Requesting record 852...
Requesting record 853...
Requesting record 854...
Requesting record 855...
Requesting record 856...
Requesting record 857...
Requesting record 858...
Requesting record 859...
Requesting record 860...
Requesting record 861...
Requesting record 862...
Requesting record 863...
Requesting record 864...
Requesting record 865...
Requesting record 866...
Requesting record 867...
Requesting record 868...


Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 1290...
Requesting record 1291...
Requesting record 1292...
Requesting record 1293...
Requesting record 1294...
Requesting record 1295...
Requesting record 1296...
Requesting record 1297...
Requesting record 1298...
Requesting record 1299...
Requesting record 1300...
Requesting record 1301...
Requesting record 1302...
Requesting record 1303...
Requesting record 1304...
Requesting record 1305...
Requesting record 1306...
Requesting record 1307...
Requesting record 1308...
Requesting record 1309...
Requesting record 1310...
Requesting record 1311...
Requesting record 1312...
Requesting record 1313...
Requesting record 1314...
Requesting record 1315...
Requesting record 1316...
Requesting record 1317...
Requesting record 1318...
Requesting record 1319...
Timeout occurred while requesting https://www.houstonmethodist.org/blog/articles/2021/sep/is-getting-too-much-sleep-bad-for-you/: HTTPSConnectionPool(host='www.houstonmethodist.org', port=443): Read timed out. (read time

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 2327...
Requesting record 2328...
Requesting record 2329...
Requesting record 2330...
Requesting record 2331...
Requesting record 2332...
Requesting record 2333...
Requesting record 2334...
Requesting record 2335...
Requesting record 2336...
Requesting record 2337...
Requesting record 2338...
Requesting record 2339...
Requesting record 2340...
Requesting record 2341...
Requesting record 2342...
Requesting record 2343...
Requesting record 2344...
Requesting record 2345...
Requesting record 2346...
Requesting record 2347...
Requesting record 2348...
Requesting record 2349...
Requesting record 2350...
Requesting record 2351...
Timeout occurred while requesting https://share.upmc.com/2014/06/when-to-seek-care-for-stomaches/: HTTPSConnectionPool(host='share.upmc.com', port=443): Read timed out. (read timeout=5)
Requesting record 2352...
Requesting record 2353...
Requesting record 2354...
Requesting record 2355...
Requesting record 2356...
Requesting record 2357...
Requesti

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 2546...
Requesting record 2547...
Requesting record 2548...
Requesting record 2549...
Requesting record 2550...
Requesting record 2551...
Requesting record 2552...
Requesting record 2553...
Requesting record 2554...
Requesting record 2555...
Requesting record 2556...
Requesting record 2557...
Requesting record 2558...
Requesting record 2559...
Requesting record 2560...
Requesting record 2561...
Requesting record 2562...
Requesting record 2563...
Requesting record 2564...
Requesting record 2565...
Requesting record 2566...
Requesting record 2567...
Requesting record 2568...
Requesting record 2569...
Requesting record 2570...
Requesting record 2571...
Requesting record 2572...
Requesting record 2573...
Requesting record 2574...
Requesting record 2575...
Requesting record 2576...
Requesting record 2577...
Requesting record 2578...
Requesting record 2579...
Requesting record 2580...
Requesting record 2581...
Requesting record 2582...
Requesting record 2583...


Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 2584...
Requesting record 2585...
Timeout occurred while requesting https://www.ehow.com/list_6677904_uniform-building-code-stairs-california.html: HTTPSConnectionPool(host='www.ehow.com', port=443): Read timed out. (read timeout=5)
Requesting record 2586...
Requesting record 2587...
Requesting record 2588...
Requesting record 2589...
Requesting record 2590...
Requesting record 2591...
Requesting record 2592...
Requesting record 2593...
Requesting record 2594...
Requesting record 2595...
Requesting record 2596...
Requesting record 2597...
Requesting record 2598...
Requesting record 2599...
Requesting record 2600...
Requesting record 2601...
Requesting record 2602...
Requesting record 2603...
Requesting record 2604...
Requesting record 2605...
Requesting record 2606...


Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 2607...
Requesting record 2608...
Requesting record 2609...
Requesting record 2610...
Requesting record 2611...
Requesting record 2612...
Requesting record 2613...
Requesting record 2614...
Requesting record 2615...
Requesting record 2616...
Requesting record 2617...
Requesting record 2618...
Requesting record 2619...
Requesting record 2620...
Requesting record 2621...
Requesting record 2622...
Requesting record 2623...
Requesting record 2624...
Requesting record 2625...
Requesting record 2626...
Requesting record 2627...
Requesting record 2628...
Requesting record 2629...
Requesting record 2630...
Requesting record 2631...
Requesting record 2632...
Requesting record 2633...
Requesting record 2634...
Requesting record 2635...
Requesting record 2636...
Requesting record 2637...
Requesting record 2638...
Requesting record 2639...
Requesting record 2640...
Requesting record 2641...
Requesting record 2642...
Requesting record 2643...
Requesting record 2644...
Requesting r

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 3287...
Requesting record 3288...
Requesting record 3289...
Requesting record 3290...
Requesting record 3291...
Requesting record 3292...
Requesting record 3293...
Requesting record 3294...
Requesting record 3295...
Requesting record 3296...
Requesting record 3297...
Requesting record 3298...
Requesting record 3299...
Requesting record 3300...
Requesting record 3301...
Requesting record 3302...
Requesting record 3303...
Requesting record 3304...
Requesting record 3305...
Requesting record 3306...
Requesting record 3307...
Requesting record 3308...
Requesting record 3309...
Requesting record 3310...
Requesting record 3311...
Requesting record 3312...
Requesting record 3313...
Requesting record 3314...
Requesting record 3315...
Requesting record 3316...
Requesting record 3317...
Requesting record 3318...
Requesting record 3319...
Requesting record 3320...
Requesting record 3321...
Requesting record 3322...
Requesting record 3323...
Requesting record 3324...
Requesting r

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 3434...
Requesting record 3435...
Requesting record 3436...
Requesting record 3437...
Requesting record 3438...
Requesting record 3439...
Timeout occurred while requesting https://alwaysfits.com/products/play-your-number-trivia-card-game: HTTPSConnectionPool(host='alwaysfits.com', port=443): Max retries exceeded with url: /products/play-your-number-trivia-card-game (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7feae3fe8f10>, 'Connection to alwaysfits.com timed out. (connect timeout=5)'))
Requesting record 3440...
Requesting record 3441...
Requesting record 3442...
Requesting record 3443...
Requesting record 3444...
Requesting record 3445...
Requesting record 3446...
Requesting record 3447...
Requesting record 3448...
Requesting record 3449...
Requesting record 3450...
Requesting record 3451...
Requesting record 3452...
Requesting record 3453...
Requesting record 3454...
Requesting record 3455...
Requesting record 3456...
Requesting rec

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 3806...
Requesting record 3807...
Requesting record 3808...
Requesting record 3809...
Requesting record 3810...
Requesting record 3811...
Requesting record 3812...
Requesting record 3813...
Requesting record 3814...
Requesting record 3815...
Requesting record 3816...
Requesting record 3817...
Requesting record 3818...
Requesting record 3819...
Requesting record 3820...
Requesting record 3821...
Requesting record 3822...
Requesting record 3823...
Requesting record 3824...
Requesting record 3825...
Requesting record 3826...
Requesting record 3827...
Requesting record 3828...
Requesting record 3829...
Requesting record 3830...
Requesting record 3831...
Requesting record 3832...
Requesting record 3833...
Requesting record 3834...
Requesting record 3835...
Requesting record 3836...
Requesting record 3837...
Requesting record 3838...
Requesting record 3839...
Requesting record 3840...
Requesting record 3841...
Requesting record 3842...
Requesting record 3843...
Requesting r

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 4104...
Requesting record 4105...
Requesting record 4106...
Requesting record 4107...
Requesting record 4108...
Requesting record 4109...
Requesting record 4110...
Requesting record 4111...
Requesting record 4112...
Requesting record 4113...
Requesting record 4114...
Requesting record 4115...
Requesting record 4116...
Requesting record 4117...
Requesting record 4118...
Requesting record 4119...
Requesting record 4120...
Requesting record 4121...
Requesting record 4122...
Requesting record 4123...
Requesting record 4124...
Requesting record 4125...
Requesting record 4126...
Requesting record 4127...
Requesting record 4128...
Requesting record 4129...
Requesting record 4130...
Requesting record 4131...
Requesting record 4132...
Requesting record 4133...
Requesting record 4134...
Requesting record 4135...
Requesting record 4136...
Requesting record 4137...
Request error occurred while requesting https://www.imageforweeds.com/all-products/kills-crabgrass: HTTPSConnectionP

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 4601...
Requesting record 4602...
Requesting record 4603...
Requesting record 4604...
Requesting record 4605...
Requesting record 4606...
Requesting record 4607...
Requesting record 4608...
Requesting record 4609...
Requesting record 4610...
Requesting record 4611...
Requesting record 4612...
Requesting record 4613...
Requesting record 4614...
Requesting record 4615...
Requesting record 4616...
Requesting record 4617...
Requesting record 4618...
Requesting record 4619...
Requesting record 4620...
Requesting record 4621...
Requesting record 4622...
Requesting record 4623...
Requesting record 4624...
Requesting record 4625...
Requesting record 4626...
Requesting record 4627...
Requesting record 4628...
Requesting record 4629...
Requesting record 4630...
Requesting record 4631...
Requesting record 4632...
Requesting record 4633...
Requesting record 4634...
Requesting record 4635...
Requesting record 4636...
Requesting record 4637...
Requesting record 4638...
Requesting r

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 5168...
Requesting record 5169...
Requesting record 5170...
Requesting record 5171...
Requesting record 5172...
Requesting record 5173...
Requesting record 5174...
Requesting record 5175...
Requesting record 5176...
Requesting record 5177...
Requesting record 5178...
Requesting record 5179...
Requesting record 5180...
Requesting record 5181...
Requesting record 5182...
Requesting record 5183...
Requesting record 5184...
Requesting record 5185...
Requesting record 5186...
Requesting record 5187...
Requesting record 5188...
Requesting record 5189...
Requesting record 5190...
Requesting record 5191...
Requesting record 5192...
Requesting record 5193...
Requesting record 5194...
Requesting record 5195...
Requesting record 5196...
Requesting record 5197...
Requesting record 5198...
Requesting record 5199...
Requesting record 5200...
Requesting record 5201...
Requesting record 5202...
Requesting record 5203...
Requesting record 5204...
Requesting record 5205...
Requesting r

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 6148...
Requesting record 6149...
Requesting record 6150...
Requesting record 6151...
Requesting record 6152...
Request error occurred while requesting https://www.hoteleffie.com/dining/ovide: HTTPSConnectionPool(host='www.hoteleffie.com', port=443): Read timed out.
Requesting record 6153...
Requesting record 6154...
Requesting record 6155...
Requesting record 6156...
Requesting record 6157...
Requesting record 6158...
Timeout occurred while requesting https://www.washingtonpost.com/people/shira-ovide/: HTTPSConnectionPool(host='www.washingtonpost.com', port=443): Read timed out. (read timeout=5)
Requesting record 6159...
Requesting record 6160...
Requesting record 6161...
Requesting record 6162...
Requesting record 6163...
Requesting record 6164...
Requesting record 6165...
Requesting record 6166...
Requesting record 6167...
Requesting record 6168...
Timeout occurred while requesting https://www.rhs.org.uk/plants/hydrangea/pruning-guide: HTTPSConnectionPool(host='www

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 6298...
Requesting record 6299...
Requesting record 6300...
Requesting record 6301...
Requesting record 6302...
Requesting record 6303...
Requesting record 6304...
Requesting record 6305...
Requesting record 6306...
Requesting record 6307...
Requesting record 6308...
Requesting record 6309...
Requesting record 6310...
Requesting record 6311...
Requesting record 6312...
Requesting record 6313...
Requesting record 6314...
Requesting record 6315...
Requesting record 6316...
Requesting record 6317...
Requesting record 6318...
Requesting record 6319...
Requesting record 6320...
Requesting record 6321...
Requesting record 6322...
Requesting record 6323...
Requesting record 6324...
Requesting record 6325...
Requesting record 6326...
Requesting record 6327...
Requesting record 6328...
Requesting record 6329...
Requesting record 6330...
Requesting record 6331...
Requesting record 6332...
Requesting record 6333...
Requesting record 6334...
Requesting record 6335...
Requesting r

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 6409...
Requesting record 6410...
Requesting record 6411...
Requesting record 6412...
Requesting record 6413...
Requesting record 6414...
Requesting record 6415...
Requesting record 6416...
Requesting record 6417...
Requesting record 6418...
Requesting record 6419...
Requesting record 6420...
Requesting record 6421...
Requesting record 6422...
Requesting record 6423...
Requesting record 6424...
Requesting record 6425...
Requesting record 6426...
Requesting record 6427...
Requesting record 6428...
Requesting record 6429...
Requesting record 6430...
Requesting record 6431...
Requesting record 6432...
Requesting record 6433...
Requesting record 6434...
Timeout occurred while requesting https://hearinghealthcenter.com/ask-the-audiologist/itchy-ears/: HTTPSConnectionPool(host='hearinghealthcenter.com', port=443): Read timed out. (read timeout=5)
Requesting record 6435...
Requesting record 6436...
Requesting record 6437...
Requesting record 6438...
Requesting record 6439...

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 6624...
Requesting record 6625...
Requesting record 6626...
Requesting record 6627...
Requesting record 6628...
Requesting record 6629...
Requesting record 6630...
Requesting record 6631...
Requesting record 6632...
Requesting record 6633...
Requesting record 6634...
Requesting record 6635...
Requesting record 6636...
Requesting record 6637...
Requesting record 6638...
Requesting record 6639...
Requesting record 6640...
Requesting record 6641...
Requesting record 6642...
Requesting record 6643...
Requesting record 6644...
Requesting record 6645...
Requesting record 6646...
Requesting record 6647...
Requesting record 6648...
Requesting record 6649...
Requesting record 6650...
Requesting record 6651...
Requesting record 6652...
Requesting record 6653...
Requesting record 6654...
Requesting record 6655...
Requesting record 6656...
Requesting record 6657...
Requesting record 6658...
Requesting record 6659...
Requesting record 6660...
Requesting record 6661...
Requesting r

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 6982...
Requesting record 6983...
Requesting record 6984...
Requesting record 6985...
Requesting record 6986...
Requesting record 6987...
Requesting record 6988...
Requesting record 6989...
Requesting record 6990...
Requesting record 6991...
Requesting record 6992...
Requesting record 6993...
Requesting record 6994...
Requesting record 6995...
Requesting record 6996...
Requesting record 6997...
Requesting record 6998...
Requesting record 6999...
Requesting record 7000...
Requesting record 7001...
Requesting record 7002...
Requesting record 7003...
Requesting record 7004...
Requesting record 7005...
Requesting record 7006...
Requesting record 7007...
Requesting record 7008...
Requesting record 7009...
Requesting record 7010...
Requesting record 7011...
Requesting record 7012...
Requesting record 7013...
Requesting record 7014...
Requesting record 7015...
Requesting record 7016...
Requesting record 7017...
Requesting record 7018...
Requesting record 7019...
Requesting r

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 8527...
Requesting record 8528...
Requesting record 8529...
Requesting record 8530...
Requesting record 8531...
Requesting record 8532...
Requesting record 8533...
Requesting record 8534...
Requesting record 8535...
Requesting record 8536...
Requesting record 8537...
Requesting record 8538...
Requesting record 8539...
Requesting record 8540...
Requesting record 8541...
Requesting record 8542...
Requesting record 8543...
Requesting record 8544...
Requesting record 8545...
Requesting record 8546...
Requesting record 8547...
Requesting record 8548...
Requesting record 8549...
Requesting record 8550...
Requesting record 8551...
Requesting record 8552...
Requesting record 8553...
Requesting record 8554...
Requesting record 8555...
Requesting record 8556...
Requesting record 8557...
Requesting record 8558...
Requesting record 8559...
Requesting record 8560...
Requesting record 8561...
Requesting record 8562...
Requesting record 8563...
Requesting record 8564...
Requesting r

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 9185...
Request error occurred while requesting http://people.stern.nyu.edu/adamodar/pc/datasets/invphil/highpastepsgrowth.xls: HTTPSConnectionPool(host='people.stern.nyu.edu', port=443): Max retries exceeded with url: /adamodar/pc/datasets/invphil/highpastepsgrowth.xls (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:992)')))
Requesting record 9186...
Requesting record 9187...
Requesting record 9188...
Requesting record 9189...
Requesting record 9190...
Requesting record 9191...
Requesting record 9192...
Requesting record 9193...
Requesting record 9194...
Requesting record 9195...
Requesting record 9196...
Requesting record 9197...
Requesting record 9198...
Requesting record 9199...
Requesting record 9200...
Requesting record 9201...
Requesting record 9202...
Requesting record 9203...
Requesting record 9204...
Requesting record 9205...
Requesting record 9206...


Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 9801...
Requesting record 9802...
Requesting record 9803...
Requesting record 9804...
Requesting record 9805...
Requesting record 9806...
Requesting record 9807...
Requesting record 9808...
Requesting record 9809...
Requesting record 9810...
Requesting record 9811...
Requesting record 9812...
Requesting record 9813...
Requesting record 9814...
Requesting record 9815...
Requesting record 9816...
Requesting record 9817...
Requesting record 9818...
Requesting record 9819...
Requesting record 9820...
Requesting record 9821...
Requesting record 9822...
Requesting record 9823...
Requesting record 9824...
Requesting record 9825...
Requesting record 9826...
Requesting record 9827...
Requesting record 9828...
Requesting record 9829...
Requesting record 9830...
Requesting record 9831...
Requesting record 9832...
Requesting record 9833...
Requesting record 9834...
Requesting record 9835...
Requesting record 9836...
Requesting record 9837...
Requesting record 9838...
Requesting r

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 9948...
Requesting record 9949...
Requesting record 9950...
Requesting record 9951...
Requesting record 9952...
Requesting record 9953...
Requesting record 9954...
Requesting record 9955...
Requesting record 9956...
Requesting record 9957...
Requesting record 9958...
Requesting record 9959...
Requesting record 9960...
Requesting record 9961...
Requesting record 9962...
Requesting record 9963...
Requesting record 9964...
Requesting record 9965...
Requesting record 9966...
Requesting record 9967...
Requesting record 9968...
Requesting record 9969...
Requesting record 9970...
Timeout occurred while requesting https://www.ehow.com/info_10058645_standard-height-kitchen-electrical-outlets.html: HTTPSConnectionPool(host='www.ehow.com', port=443): Read timed out. (read timeout=5)
Requesting record 9971...
Timeout occurred while requesting https://www.hunker.com/13414377/what-is-the-average-height-of-an-electrical-outlet-in-a-basement: HTTPSConnectionPool(host='www.hunker.com'

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 9978...
Requesting record 9979...
Requesting record 9980...
Requesting record 9981...
Requesting record 9982...
Requesting record 9983...
Requesting record 9984...
Requesting record 9985...


Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 9986...
Requesting record 9987...
Requesting record 9988...
Requesting record 9989...
Requesting record 9990...
Requesting record 9991...
Requesting record 9992...
Requesting record 9993...
Requesting record 9994...
Requesting record 9995...
Requesting record 9996...
Requesting record 9997...
Requesting record 9998...
Requesting record 9999...
Requesting record 10000...
Requesting record 10001...
Requesting record 10002...
Requesting record 10003...
Requesting record 10004...
Requesting record 10005...
Requesting record 10006...
Requesting record 10007...
Requesting record 10008...
Requesting record 10009...
Requesting record 10010...
Requesting record 10011...
Requesting record 10012...
Requesting record 10013...
Requesting record 10014...
Timeout occurred while requesting https://www.ahs.com/home-matters/repair-maintenance/how-to-clean-your-dryer/: HTTPSConnectionPool(host='www.ahs.com', port=443): Read timed out. (read timeout=5)
Requesting record 10015...
Requesti

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 11169...
Requesting record 11170...
Requesting record 11171...
Requesting record 11172...
Requesting record 11173...
Requesting record 11174...
Requesting record 11175...
Requesting record 11176...
Requesting record 11177...
Requesting record 11178...
Requesting record 11179...
Requesting record 11180...
Requesting record 11181...
Requesting record 11182...
Requesting record 11183...
Requesting record 11184...
Requesting record 11185...
Requesting record 11186...
Requesting record 11187...
Requesting record 11188...
Requesting record 11189...
Requesting record 11190...
Requesting record 11191...
Requesting record 11192...
Requesting record 11193...
Requesting record 11194...
Requesting record 11195...
Requesting record 11196...
Requesting record 11197...
Requesting record 11198...
Requesting record 11199...
Requesting record 11200...
Requesting record 11201...
Requesting record 11202...
Requesting record 11203...
Requesting record 11204...
Requesting record 11205...
R

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 11582...
Requesting record 11583...
Requesting record 11584...
Requesting record 11585...
Requesting record 11586...
Requesting record 11587...
Requesting record 11588...
Requesting record 11589...
Requesting record 11590...
Requesting record 11591...
Requesting record 11592...
Requesting record 11593...
Requesting record 11594...
Requesting record 11595...
Requesting record 11596...
Requesting record 11597...
Requesting record 11598...
Requesting record 11599...
Requesting record 11600...
Requesting record 11601...
Requesting record 11602...
Requesting record 11603...
Requesting record 11604...
Requesting record 11605...
Requesting record 11606...
Requesting record 11607...
Requesting record 11608...
Requesting record 11609...
Requesting record 11610...
Requesting record 11611...
Requesting record 11612...
Requesting record 11613...
Requesting record 11614...
Requesting record 11615...
Requesting record 11616...
Requesting record 11617...
Requesting record 11618...
R

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 12169...
Requesting record 12170...
Requesting record 12171...
Requesting record 12172...
Requesting record 12173...
Requesting record 12174...
Requesting record 12175...
Requesting record 12176...
Requesting record 12177...
Requesting record 12178...
Timeout occurred while requesting https://wallboardtrim.com/what-is-greenboard-drywall/: HTTPSConnectionPool(host='wallboardtrim.com', port=443): Read timed out. (read timeout=5)
Requesting record 12179...
Requesting record 12180...
Requesting record 12181...
Requesting record 12182...
Requesting record 12183...
Requesting record 12184...
Timeout occurred while requesting https://www.ehow.com/info_8597242_green-board-vs-cement-board.html: HTTPSConnectionPool(host='www.ehow.com', port=443): Read timed out. (read timeout=5)
Requesting record 12185...
Requesting record 12186...
Requesting record 12187...
Requesting record 12188...
Requesting record 12189...
Requesting record 12190...
Requesting record 12191...
Requesting re

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 12392...
Requesting record 12393...
Requesting record 12394...
Requesting record 12395...
Requesting record 12396...
Requesting record 12397...
Requesting record 12398...
Requesting record 12399...
Requesting record 12400...
Requesting record 12401...
Requesting record 12402...
Requesting record 12403...
Requesting record 12404...
Requesting record 12405...
Requesting record 12406...
Requesting record 12407...
Requesting record 12408...
Requesting record 12409...
Requesting record 12410...
Requesting record 12411...
Requesting record 12412...
Requesting record 12413...
Requesting record 12414...
Requesting record 12415...
Requesting record 12416...
Requesting record 12417...
Requesting record 12418...
Requesting record 12419...
Requesting record 12420...
Requesting record 12421...
Requesting record 12422...
Requesting record 12423...
Request error occurred while requesting https://www.plumbingsupply.com/how-to-replace-toilet-flappers.html: HTTPSConnectionPool(host='ww

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 12612...
Requesting record 12613...
Requesting record 12614...
Requesting record 12615...
Requesting record 12616...
Requesting record 12617...
Requesting record 12618...
Requesting record 12619...
Requesting record 12620...
Requesting record 12621...
Requesting record 12622...
Requesting record 12623...
Requesting record 12624...
Requesting record 12625...
Requesting record 12626...
Requesting record 12627...
Requesting record 12628...
Requesting record 12629...
Requesting record 12630...
Requesting record 12631...
Requesting record 12632...
Requesting record 12633...
Requesting record 12634...
Requesting record 12635...
Requesting record 12636...
Requesting record 12637...
Requesting record 12638...
Requesting record 12639...
Requesting record 12640...
Requesting record 12641...
Requesting record 12642...
Requesting record 12643...
Requesting record 12644...
Requesting record 12645...
Requesting record 12646...
Requesting record 12647...
Requesting record 12648...

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 12886...
Requesting record 12887...
Requesting record 12888...
Requesting record 12889...
Requesting record 12890...
Requesting record 12891...
Requesting record 12892...
Requesting record 12893...
Requesting record 12894...
Requesting record 12895...
Requesting record 12896...
Requesting record 12897...
Requesting record 12898...
Requesting record 12899...
Requesting record 12900...
Requesting record 12901...
Requesting record 12902...
Requesting record 12903...
Requesting record 12904...
Requesting record 12905...
Requesting record 12906...
Timeout occurred while requesting https://backyardpoultry.iamcountryside.com/coops/chicken-nesting-box-ideas/: HTTPSConnectionPool(host='backyardpoultry.iamcountryside.com', port=443): Read timed out. (read timeout=5)
Requesting record 12907...
Requesting record 12908...
Requesting record 12909...
Requesting record 12910...
Requesting record 12911...
Requesting record 12912...
Requesting record 12913...
Requesting record 12914...

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 13175...
Requesting record 13176...
Requesting record 13177...
Requesting record 13178...
Requesting record 13179...
Requesting record 13180...
Requesting record 13181...
Requesting record 13182...
Requesting record 13183...
Requesting record 13184...
Requesting record 13185...
Requesting record 13186...
Requesting record 13187...
Requesting record 13188...
Requesting record 13189...
Requesting record 13190...
Requesting record 13191...
Requesting record 13192...
Requesting record 13193...
Requesting record 13194...
Requesting record 13195...
Requesting record 13196...
Requesting record 13197...
Request error occurred while requesting https://www.northcreeknurseries.com/index.cfm/fuseaction/mobile.plant/ID/732/index.htm: HTTPSConnectionPool(host='www.northcreeknurseries.com', port=443): Max retries exceeded with url: /index.cfm/fuseaction/mobile.plant/ID/732/index.htm (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify 

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 13199...
Requesting record 13200...
Requesting record 13201...
Requesting record 13202...
Requesting record 13203...
Requesting record 13204...
Requesting record 13205...
Requesting record 13206...
Requesting record 13207...
Requesting record 13208...
Requesting record 13209...
Requesting record 13210...
Requesting record 13211...
Requesting record 13212...
Requesting record 13213...
Requesting record 13214...
Requesting record 13215...
Requesting record 13216...
Requesting record 13217...
Requesting record 13218...
Requesting record 13219...
Requesting record 13220...
Requesting record 13221...
Requesting record 13222...
Requesting record 13223...
Requesting record 13224...
Requesting record 13225...
Requesting record 13226...
Requesting record 13227...
Requesting record 13228...
Requesting record 13229...
Requesting record 13230...
Requesting record 13231...
Requesting record 13232...
Requesting record 13233...
Requesting record 13234...
Requesting record 13235...

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 13771...
Requesting record 13772...
Requesting record 13773...
Requesting record 13774...
Requesting record 13775...
Requesting record 13776...
Requesting record 13777...
Requesting record 13778...
Requesting record 13779...
Requesting record 13780...
Requesting record 13781...
Requesting record 13782...
Requesting record 13783...
Requesting record 13784...
Requesting record 13785...
Requesting record 13786...
Requesting record 13787...
Requesting record 13788...
Requesting record 13789...
Requesting record 13790...
Requesting record 13791...
Requesting record 13792...
Requesting record 13793...
Requesting record 13794...
Requesting record 13795...
Requesting record 13796...
Requesting record 13797...
Requesting record 13798...
Requesting record 13799...
Requesting record 13800...
Requesting record 13801...
Requesting record 13802...
Requesting record 13803...
Requesting record 13804...
Requesting record 13805...
Requesting record 13806...
Requesting record 13807...

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 14090...
Requesting record 14091...
Requesting record 14092...
Requesting record 14093...
Requesting record 14094...
Requesting record 14095...
Requesting record 14096...
Requesting record 14097...
Requesting record 14098...
Requesting record 14099...
Requesting record 14100...
Requesting record 14101...
Requesting record 14102...
Requesting record 14103...
Requesting record 14104...
Requesting record 14105...
Requesting record 14106...
Requesting record 14107...
Requesting record 14108...
Requesting record 14109...
Requesting record 14110...
Requesting record 14111...
Requesting record 14112...
Requesting record 14113...
Requesting record 14114...
Requesting record 14115...
Requesting record 14116...
Requesting record 14117...
Requesting record 14118...
Requesting record 14119...
Requesting record 14120...
Requesting record 14121...
Requesting record 14122...
Requesting record 14123...
Requesting record 14124...
Requesting record 14125...
Requesting record 14126...

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Requesting record 14131...
Requesting record 14132...
Requesting record 14133...
Requesting record 14134...
Requesting record 14135...
Requesting record 14136...
Requesting record 14137...
Requesting record 14138...
Requesting record 14139...
Requesting record 14140...
Requesting record 14141...
Requesting record 14142...
Requesting record 14143...
Requesting record 14144...
Requesting record 14145...
Requesting record 14146...
Requesting record 14147...
Requesting record 14148...
Requesting record 14149...
Requesting record 14150...
Requesting record 14151...
Requesting record 14152...
Requesting record 14153...
Requesting record 14154...
Requesting record 14155...
Requesting record 14156...
Requesting record 14157...
Requesting record 14158...
Requesting record 14159...
Requesting record 14160...
Requesting record 14161...
Requesting record 14162...
Requesting record 14163...
Requesting record 14164...
Requesting record 14165...
Requesting record 14166...
Requesting record 14167...
R
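
The output above mixes progress lines with two kinds of failure messages (`Timeout occurred while requesting ...` and `Request error occurred while requesting ...`). A minimal sketch of how those failures could be collected for a later retry, assuming the output has been captured as a list of strings. The names `collect_failed_requests`, `RECORD_PATTERN`, and `FAILURE_PATTERN` are hypothetical and not part of this notebook. It relies on each failure message being printed right after the `Requesting record N...` line for the same record.

```python
import re

# Progress line printed before each request, e.g. "Requesting record 12178..."
RECORD_PATTERN = re.compile(r"^Requesting record (\d+)\.\.\.$")

# The two failure messages printed by get_url_response.
# The URL is taken as the run of non-whitespace characters before ": ".
FAILURE_PATTERN = re.compile(
    r"^(Timeout|Request error) occurred while requesting (\S+): "
)

def collect_failed_requests(log_lines):
    """Return (record_number, kind, url) for every failure line in the log."""
    failures = []
    current_record = None
    for line in log_lines:
        record_match = RECORD_PATTERN.match(line)
        if record_match:
            current_record = int(record_match.group(1))
            continue
        failure_match = FAILURE_PATTERN.match(line)
        if failure_match:
            kind, url = failure_match.groups()
            failures.append((current_record, kind, url))
    return failures

# Sample lines copied from the output above (the last one is truncated there too).
sample_log = [
    "Requesting record 12178...",
    "Timeout occurred while requesting https://wallboardtrim.com/what-is-greenboard-drywall/: "
    "HTTPSConnectionPool(host='wallboardtrim.com', port=443): Read timed out. (read timeout=5)",
    "Requesting record 12179...",
    "Request error occurred while requesting "
    "https://www.plumbingsupply.com/how-to-replace-toilet-flappers.html: HTTPSConnectionPool(host='ww",
]

failed = collect_failed_requests(sample_log)
for record, kind, url in failed:
    print(record, kind, url)
```

The resulting list could be passed back through the request step with a longer timeout. Whether that recovers the missing pages depends on the site, and certificate errors in particular are unlikely to be fixed by a retry alone.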

In [108]:
# Example: request the text content for a single record

In [109]:
url = 'https://www.target.com/c/table-lamps-lighting-home-decor/-/N-56d7t'

content = get_url_response(url)
print(content)

A bedroom table lamp sets the mood for your bedroom and illuminates reading and task work. These should be 24"–27" tall for optimal bedtime reading. Light your living room and create an inviting atmosphere with the right table lamp. The ideal lamp size for a living room table lamp is 24"–34" tall. Task desk lighting can help bring specific areas into focus, while a taller lamp can brighten a larger area. Desk lamps can range in size from 12"–30" high. These small (less than 24" tall) lamps soften the overall light in a room and draw attention to objects nearby. They’re perfect for entryways or living rooms. Add warmth & light to any space with a table lamp. As easy addition, lamps help set the right ambiance for every type of room. Living rooms require soft lighting to create an inviting atmosphere for your guests. Accent lamps not only brighten up the room but also double up as decor for your living room. The base and shade complement each other making them a great decor accessory. Th

In [110]:
type(content)

str
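
Because `content` is a plain `str`, one way to check whether a record was affected by the "REPLACEMENT CHARACTER" warning seen in the earlier output is to count occurrences of U+FFFD in the text. The sketch below is an illustration only. The helper name `count_replacement_chars` is hypothetical, and the sample bytes are invented to show where such characters come from. It uses Python's built-in decoding rather than Beautiful Soup's own encoding detection.

```python
# U+FFFD is the Unicode REPLACEMENT CHARACTER inserted for undecodable bytes.
REPLACEMENT_CHAR = "\ufffd"

def count_replacement_chars(text):
    """Count how many U+FFFD characters appear in the extracted text."""
    return text.count(REPLACEMENT_CHAR)

# Invented sample: Latin-1 bytes decoded as UTF-8 with errors="replace".
raw_bytes = "café".encode("latin-1")
decoded = raw_bytes.decode("utf-8", errors="replace")

damaged = count_replacement_chars(decoded)
clean = count_replacement_chars("plain ASCII text")
print(damaged, clean)
```

Calling `count_replacement_chars(content)` on the text retrieved above would give a rough signal of whether that page needs to be fetched again with a different encoding.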