This notebook contains the code used to scrape text data from the ['Catering' forum of SQTalk](http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering). The output is saved in the [`webscraped_data`](.../datasets/webscraped_data) folder as [`catering.csv`](.../datasets/webscraped_data/catering.csv).

In [1]:
# Import libraries
import requests # HTTP library for Python, used to fetch the forum pages
from bs4 import BeautifulSoup # library for pulling data out of HTML
import re
import pandas as pd
from tqdm import tqdm
tqdm.pandas()
import time

In [3]:
%%time
# Collect the URL of every page of every thread in the forum programmatically
url_list = [] # one URL per thread page, appended in the inner loop below

for num_results in range(1, 25): # forum listing pages 1 to 24
    url = f'http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page{num_results}'
    print(url)
    
    response = requests.get(url)
    print('Status Code: ',response.status_code)
    
    soup = BeautifulSoup(response.text, "lxml")
    topics = soup.find_all('a', {'class':'topic-title js-topic-title'})
    topics_links = [topic['href'] for topic in topics]
       
    for link in tqdm(topics_links, desc='1st loop'): # loop over each thread listed on this page
        url2 = link
        response2 = requests.get(url2)
        soup2 = BeautifulSoup(response2.text, "lxml")
        # Fall back to a single page when no 'pagetotal' span is found
        try:
            page_total = int(soup2.find('span', {'class':'pagetotal'}).text)
        except (AttributeError, ValueError):
            page_total = 1
        
        for page in tqdm(range(1,page_total+1), desc='2nd loop'):
            adjusted_url = f'{url2}/page{page}'
            url_list.append(adjusted_url)            

http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page1
Status Code:  200


1st loop:   0%|                                          | 0/16 [00:00<?, ?it/s]
2nd loop: 100%|█████████████████████████████| 44/44 [00:00<00:00, 308095.79it/s]
1st loop:   6%|██▏                               | 1/16 [00:13<03:18, 13.26s/it]
2nd loop: 100%|█████████████████████████████| 64/64 [00:00<00:00, 201983.04it/s]
1st loop:  12%|████▎                             | 2/16 [00:16<01:42,  7.29s/it]
2nd loop: 100%|█████████████████████████████| 31/31 [00:00<00:00, 200036.04it/s]
1st loop:  19%|██████▍                           | 3/16 [00:20<01:15,  5.80s/it]
2nd loop: 100%|████████████████████████████████| 6/6 [00:00<00:00, 13486.51it/s]
1st loop:  25%|████████▌                         | 4/16 [00:28<01:22,  6.85s/it]
2nd loop: 100%|████████████████████████████████| 4/4 [00:00<00:00, 40820.48it/s]
1st loop:  31%|██████████▋                       | 5/16 [00:33<01:08,  6.21s/it]
2nd loop: 100%|████████████████████████████████| 3/3 [00:00<00:00, 18052.96it/s]
1st loop: 

http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page2
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page3
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page4
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page5
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page6
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page7
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page8
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page9
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page10
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page11
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page12
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page13
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page14
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page15
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page16
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page17
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page18
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page19
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page20
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page21
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page22
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page23
Status Code:  200



http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/page24
Status Code:  200



CPU times: user 38.1 s, sys: 2.46 s, total: 40.6 s
Wall time: 26min 59s
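
The crawl above makes several hundred sequential requests over roughly 27 minutes, and some threads took more than 30 seconds to respond, so a single failed request would abort the whole run. The following is a minimal sketch of a retry wrapper, not part of the original notebook. `fetch_with_retry` and its parameters are hypothetical names, and `fetch` stands for any callable that returns a response or raises on failure, for example `lambda u: requests.get(u, timeout=30)`.

```python
import time


def fetch_with_retry(fetch, url, retries=3, delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying on failure for up to `retries` attempts.

    Hypothetical helper for illustration only. `fetch` is any callable
    that returns a response or raises an exception when the request fails.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")

    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return fetch(url)
        except Exception as err:  # broad on purpose: this is only a sketch
            last_error = err
            if attempt < retries:
                # Wait a little longer after each failed attempt
                sleep(delay * attempt)
    # Every attempt failed: re-raise the most recent error
    raise last_error
```

With such a helper, `requests.get(url2)` in the loop could be swapped for `fetch_with_retry(lambda u: requests.get(u, timeout=30), url2)`; whether a retry is appropriate for a given failure is a judgment call this sketch does not make.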





In [6]:
# Drop duplicate URLs caused by sticky topics on the website
url_list_set = set(url_list)
len(url_list_set)

486

In [7]:
# Change the set back to a list
url_list = list(url_list_set)
len(url_list)

486
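
Converting to a `set` and back removes the duplicates but does not keep the URLs in the order they were collected. If that order matters (for example, to keep the pages of one thread together), a possible alternative is `dict.fromkeys`, which keeps the first occurrence of each item in insertion order. This is a sketch only; `dedupe_keep_order` and the sample URLs are hypothetical.

```python
def dedupe_keep_order(urls):
    """Return the URLs with duplicates removed, keeping first-seen order."""
    # dict keys are unique and preserve insertion order (Python 3.7+)
    return list(dict.fromkeys(urls))


# Hypothetical sample: the first thread is listed twice
sample = [
    'http://example.invalid/thread-a/page1',
    'http://example.invalid/thread-a/page2',
    'http://example.invalid/thread-a/page1',
    'http://example.invalid/thread-b/page1',
]
unique_sample = dedupe_keep_order(sample)
```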

In [9]:
%%time
posts_catering = [] # one dict per scraped post (title, link, reply)
for i in tqdm(url_list, desc='3rd loop'):
    url3 = i
    response3 = requests.get(url3)
    soup3 = BeautifulSoup(response3.text, "lxml")
    title = soup3.find_all('h1', {'class':'main-title js-main-title hide-on-editmode'})
    title = title[1].text
    post_list = soup3.find_all('div', {'class':'b-post__body'})
        
    for j in tqdm(post_list, desc='4th loop'):
        post_dict = {}
        post_dict['title'] = title
        post_dict['link'] = i
        reply = j.find('div', {'class':'js-post__content-text'}).text
        # Collapse line breaks, tabs and repeated spaces into single spaces
        reply = re.sub(r'\r|\n\t', ' ', reply)
        reply = re.sub(r'\s+', ' ', reply)
        post_dict['reply'] = reply
        posts_catering.append(post_dict)

3rd loop:   0%|                                         | 0/486 [00:00<?, ?it/s]
4th loop: 100%|███████████████████████████████| 15/15 [00:00<00:00, 2016.10it/s]
3rd loop:   0%|                                 | 1/486 [00:03<27:18,  3.38s/it]
4th loop: 100%|█████████████████████████████████| 2/2 [00:00<00:00, 2222.15it/s]
3rd loop:   0%|▏                                | 2/486 [00:05<19:16,  2.39s/it]
4th loop: 100%|███████████████████████████████| 15/15 [00:00<00:00, 5226.76it/s]
3rd loop:   1%|▏                                | 3/486 [00:08<22:08,  2.75s/it]
4th loop: 100%|███████████████████████████████| 15/15 [00:00<00:00, 3124.33it/s]
3rd loop:   1%|▎                                | 4/486 [00:11<23:19,  2.90s/it]
4th loop: 100%|███████████████████████████████| 11/11 [00:00<00:00, 2161.10it/s]
3rd loop:   1%|▎                                | 5/486 [00:14<22:32,  2.81s/it]
4th loop: 100%|███████████████████████████████| 15/15 [00:00<00:00, 2395.83it/s]
3rd loop: 

CPU times: user 56.2 s, sys: 3.03 s, total: 59.3 s
Wall time: 21min 13s
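
The whitespace clean-up applied to each reply can be pulled out into a small function so it can be checked on its own. This is a sketch rather than the notebook's code: `normalise_whitespace` is a hypothetical name, and the optional `strip` argument is an addition (the cell above keeps the leading and trailing space).

```python
import re


def normalise_whitespace(text, strip=False):
    """Collapse every run of whitespace (spaces, tabs, line breaks) to one space.

    With strip=True, leading and trailing spaces are removed as well;
    that step is an assumption and is not done in the original cell.
    """
    cleaned = re.sub(r'\s+', ' ', text)
    return cleaned.strip() if strip else cleaned


# Made-up input mimicking the raw text of a forum post
raw = '\r\n\t SQ961\n CGK-SIN \t'
print(repr(normalise_whitespace(raw)))              # ' SQ961 CGK-SIN '
print(repr(normalise_whitespace(raw, strip=True)))  # 'SQ961 CGK-SIN'
```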





In [10]:
# Check scraped data
posts_catering

[{'title': 'SQ First Menus',
  'link': 'http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/68-/page39',
  'reply': ' SQ961 CGK-SIN August 2014 dinner FROM JAKARTA TO SINGAPORE MAIN COURSES Beef Stik Malay Style Beef steak served with vegetables in coconut cream and steamed rice Baked herb crusted gindara cod on a bed of warm potato salad and cherry tomato Stir fried rice vermicelli with chicken, prawns and vegetables DESSERT Rhubarb crumble with vanilla sauce and mixed berries FRESH FRUIT A selection of fresh fruit FROM THE BAKERY Oven fresh rolls with a choice of extra virgin olive oil or butter Garlic bread HOT BEVERAGES A selection of gourmet coffees & fine teas '},
 {'title': 'SQ First Menus',
  'link': 'http://www.sqtalk.com/forum/forum/singapore-airlines/sq-catering-and-amenities/singapore-airlines-catering/68-/page39',
  'reply': ' LUNCH SQ 333 CDG-SIN MENU AUGUST 2014 CANAPES Satay With onion, cucumber and spicy peanut sau

In [12]:
# Put scraped data into a dataframe
df_catering = pd.DataFrame(posts_catering)
print(df_catering.shape)
df_catering.head()

(5001, 3)


|   | title | link | reply |
|---|-------|------|-------|
| 0 | SQ First Menus | http://www.sqtalk.com/forum/forum/singapore-ai... | SQ961 CGK-SIN August 2014 dinner FROM JAKARTA... |
| 1 | SQ First Menus | http://www.sqtalk.com/forum/forum/singapore-ai... | LUNCH SQ 333 CDG-SIN MENU AUGUST 2014 CANAPES... |
| 2 | SQ First Menus | http://www.sqtalk.com/forum/forum/singapore-ai... | SQ833: PVG-SIN dinner menu SHANGHAI TO SINGAP... |
| 3 | SQ First Menus | http://www.sqtalk.com/forum/forum/singapore-ai... | hi does anyone have a recent sin-cdg menu in ... |
| 4 | SQ First Menus | http://www.sqtalk.com/forum/forum/singapore-ai... | SQ has added a new red wine to its First Clas... |


In [13]:
# Save and export dataframe
df_catering.to_csv('.../datasets/webscraped_data/catering.csv', index=False)