## Part 1: Cleaning Books dataset
___
1. Remove columns we don't want.
2. Remove rows that have NaN values.
3. Remove rows with no image.
4. Keep rows that are in English.
5. Keep first author only.
6. Get the description and page_count columns by web scraping.
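
As a rough sketch, steps 1 to 5 can be chained on a toy frame. The sample rows and the `clean_books` helper are made up for illustration; the image filter drops the `nophoto` placeholder rows as step 3 describes, and the `startswith('en')` language test is an assumption rather than the notebook's exact filter.

```python
import pandas as pd

def clean_books(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper chaining steps 1-5 on a goodbooks-style frame."""
    df = df.drop(columns=["work_id"])                     # 1. drop unwanted columns
    df = df.dropna()                                      # 2. drop rows with NaN
    df = df[~df["image_url"].str.contains("nophoto")]     # 3. drop placeholder images
    df = df[df["language_code"].str.startswith("en")]     # 4. keep English codes
    df = df.drop(columns=["language_code"])
    df = df.assign(author=df["authors"].str.split(",").str[0])  # 5. first author only
    return df.drop(columns=["authors"])

# Invented rows, only to exercise each step
toy = pd.DataFrame({
    "work_id": [1, 2, 3, 4],
    "authors": ["A One, B Two", "C Three", "D Four", "E Five"],
    "language_code": ["eng", "en-US", "fre", None],
    "image_url": ["http://x/1.jpg", "http://x/nophoto/2.png",
                  "http://x/3.jpg", "http://x/4.jpg"],
})
cleaned = clean_books(toy)
print(cleaned)
```

Only the first toy row survives: the others are dropped by the NaN, image and language filters respectively.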


In [1]:
import os
import pandas as pd

In [2]:
# Load datasets
dataset_path = "../dataset/goodbooks-10k"
books_path = os.path.join(dataset_path, "books.csv")
tags_path = os.path.join(dataset_path, "tags.csv")
book_tags_path = os.path.join(dataset_path, "book_tags.csv")

tags_df = pd.read_csv(tags_path)
books_df = pd.read_csv(books_path)
book_tags_df = pd.read_csv(book_tags_path)

In [3]:
books_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 23 columns):
book_id                      10000 non-null int64
goodreads_book_id            10000 non-null int64
best_book_id                 10000 non-null int64
work_id                      10000 non-null int64
books_count                  10000 non-null int64
isbn                         9300 non-null object
isbn13                       9415 non-null float64
authors                      10000 non-null object
original_publication_year    9979 non-null float64
original_title               9415 non-null object
title                        10000 non-null object
language_code                8916 non-null object
average_rating               10000 non-null float64
ratings_count                10000 non-null int64
work_ratings_count           10000 non-null int64
work_text_reviews_count      10000 non-null int64
ratings_1                    10000 non-null int64
ratings_2                    10000 n

In [4]:
# 1. Remove columns we don't want
books_df.drop(columns=['best_book_id', 'work_id', 'books_count', 'isbn13', 'original_title', 'work_ratings_count', 'work_text_reviews_count'], inplace=True)
books_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 16 columns):
book_id                      10000 non-null int64
goodreads_book_id            10000 non-null int64
isbn                         9300 non-null object
authors                      10000 non-null object
original_publication_year    9979 non-null float64
title                        10000 non-null object
language_code                8916 non-null object
average_rating               10000 non-null float64
ratings_count                10000 non-null int64
ratings_1                    10000 non-null int64
ratings_2                    10000 non-null int64
ratings_3                    10000 non-null int64
ratings_4                    10000 non-null int64
ratings_5                    10000 non-null int64
image_url                    10000 non-null object
small_image_url              10000 non-null object
dtypes: float64(2), int64(8), object(6)
memory usage: 1.2+ MB


In [5]:
# 2. Remove rows with NaN values
books_df.dropna(inplace=True)
books_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 8243 entries, 0 to 9998
Data columns (total 16 columns):
book_id                      8243 non-null int64
goodreads_book_id            8243 non-null int64
isbn                         8243 non-null object
authors                      8243 non-null object
original_publication_year    8243 non-null float64
title                        8243 non-null object
language_code                8243 non-null object
average_rating               8243 non-null float64
ratings_count                8243 non-null int64
ratings_1                    8243 non-null int64
ratings_2                    8243 non-null int64
ratings_3                    8243 non-null int64
ratings_4                    8243 non-null int64
ratings_5                    8243 non-null int64
image_url                    8243 non-null object
small_image_url              8243 non-null object
dtypes: float64(2), int64(8), object(6)
memory usage: 1.1+ MB


In [6]:
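# 3. Filter on the 'nophoto' placeholder image.
# Note: this mask keeps the rows whose image_url contains 'nophoto' (see the output below);
# negate it with ~ to drop them instead.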
books_df = books_df[books_df['image_url'].str.contains('nophoto')]
books_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 2683 entries, 32 to 9996
Data columns (total 16 columns):
book_id                      2683 non-null int64
goodreads_book_id            2683 non-null int64
isbn                         2683 non-null object
authors                      2683 non-null object
original_publication_year    2683 non-null float64
title                        2683 non-null object
language_code                2683 non-null object
average_rating               2683 non-null float64
ratings_count                2683 non-null int64
ratings_1                    2683 non-null int64
ratings_2                    2683 non-null int64
ratings_3                    2683 non-null int64
ratings_4                    2683 non-null int64
ratings_5                    2683 non-null int64
image_url                    2683 non-null object
small_image_url              2683 non-null object
dtypes: float64(2), int64(8), object(6)
memory usage: 356.3+ KB


In [7]:
# 4. Remove non-English books
books_df = books_df[books_df['language_code'].str.contains('en')]
books_df.drop(columns=['language_code'], inplace=True)
books_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 2648 entries, 32 to 9996
Data columns (total 15 columns):
book_id                      2648 non-null int64
goodreads_book_id            2648 non-null int64
isbn                         2648 non-null object
authors                      2648 non-null object
original_publication_year    2648 non-null float64
title                        2648 non-null object
average_rating               2648 non-null float64
ratings_count                2648 non-null int64
ratings_1                    2648 non-null int64
ratings_2                    2648 non-null int64
ratings_3                    2648 non-null int64
ratings_4                    2648 non-null int64
ratings_5                    2648 non-null int64
image_url                    2648 non-null object
small_image_url              2648 non-null object
dtypes: float64(2), int64(8), object(5)
memory usage: 331.0+ KB


In [8]:
# 5. Keep name of first author only (second name usually illustrator or introduction by)
books_df['author'] = books_df['authors'].apply(lambda x:  x.split(',')[0])
books_df.drop(columns=['authors'], inplace=True)
books_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 2648 entries, 32 to 9996
Data columns (total 15 columns):
book_id                      2648 non-null int64
goodreads_book_id            2648 non-null int64
isbn                         2648 non-null object
original_publication_year    2648 non-null float64
title                        2648 non-null object
average_rating               2648 non-null float64
ratings_count                2648 non-null int64
ratings_1                    2648 non-null int64
ratings_2                    2648 non-null int64
ratings_3                    2648 non-null int64
ratings_4                    2648 non-null int64
ratings_5                    2648 non-null int64
image_url                    2648 non-null object
small_image_url              2648 non-null object
author                       2648 non-null object
dtypes: float64(2), int64(8), object(5)
memory usage: 331.0+ KB


In [9]:
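import re
from bs4 import BeautifulSoup
import requests

def clean_line(line):
    """Turn one child node of the description into plain text."""
    line = str(line)
    # Line breaks become blank lines; stray closing tags and non-breaking spaces are normalised
    line = line.replace('<br/>', '\n\n')
    line = line.replace('<br>', '\n\n')
    line = line.replace('</br>', '')
    line = line.replace('\xa0', ' ')

    # Drop any fragment that contains a link
    if 'https' in line:
        line = ''

    # Strip the short tags that remain, such as <i>, </b> or <span>
    line = re.sub('<.{0,6}>', '', line)
    line = re.sub('</.{0,6}>', '', line)
    return line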

In [53]:
# 6. Web scrape for description and number of pages (takes hours)
from bs4 import BeautifulSoup
import requests
from tqdm import tqdm
descriptions = []
page_counts = []
books_to_remove = []

prefix = 'https://www.goodreads.com/book/show/'
for goodreads_book_id in tqdm(books_df['goodreads_book_id']):
    url = prefix + str(goodreads_book_id)
    
    # Get request
    while True: # So it tries again when request fails
        page = requests.get(url)
        if page.ok:
            break

    print(goodreads_book_id)
    soup = BeautifulSoup(page.content, 'html.parser')
    
    # Get page count
    nb_pages = soup.find("span", itemprop="numberOfPages")
    if nb_pages is None: # Not a book (probably audio recording / cassette)
        books_to_remove.append(goodreads_book_id)
        continue
    page_count = int(nb_pages.text.split()[0])
    page_counts.append(page_count)
    
    # Get description
    description = soup.find(id="description")
    if description is None:
        descriptions.append('No description available.')
    else:
        description_text = description.find_all('span')[-1]
        description_text = ''.join(clean_line(line) for line in description_text.children)
        while '\n ' in description_text:
            description_text = re.sub('\n ', '\n', description_text)
        description_text = re.sub('\n+', '\n\n', description_text)
        description_text = description_text.strip()
        descriptions.append(description_text)
    



  0%|          | 0/305 [00:00<?, ?it/s]6150530

  0%|          | 1/305 [00:03<17:28,  3.45s/it]198511

  1%|          | 2/305 [00:07<18:06,  3.59s/it]34941

  1%|          | 3/305 [00:12<19:52,  3.95s/it]18889

  1%|▏         | 4/305 [00:16<20:06,  4.01s/it]829313

  2%|▏         | 5/305 [00:20<19:47,  3.96s/it]2230284

  2%|▏         | 6/305 [00:26<23:41,  4.75s/it]132609

  2%|▏         | 7/305 [00:30<21:46,  4.39s/it]78674

  3%|▎         | 8/305 [00:36<24:30,  4.95s/it]13074

  3%|▎         | 9/305 [00:42<25:52,  5.25s/it]22304

  3%|▎         | 10/305 [00:48<26:39,  5.42s/it]12786

  4%|▎         | 11/305 [00:51<23:35,  4.81s/it]37619

  4%|▍         | 12/305 [00:55<22:00,  4.51s/it]4599

  4%|▍         | 13/305 [00:59<20:45,  4.26s/it]4609710

  5%|▍         | 14/305 [01:02<19:14,  3.97s/it]37298

  5%|▍         | 15/305 [01:07<20:14,  4.19s/it]228194

  5%|▌         | 16/305 [01:11<19:59,  4.15s/it]43345

  6%|▌         | 17/305 [01:14<18:51,  3.93s/it]32636

  6%|▌         | 1
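
The `while True` loop in the scraping cell retries forever and without a pause. A bounded retry with a short delay is one possible alternative; the sketch below assumes a hypothetical `fetch_with_retry` helper that receives the request function as an argument, so it can be tried without network access.

```python
import time

def fetch_with_retry(get, url, max_attempts=5, delay_s=2.0):
    """Hypothetical helper: call get(url) up to max_attempts times.

    Returns the first response whose .ok attribute is true, or None
    if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            page = get(url, timeout=30)
            if page.ok:
                return page
        except OSError:
            # requests' exceptions derive from OSError; treat them as a failed attempt
            pass
        if attempt < max_attempts:
            time.sleep(delay_s)  # pause before the next attempt
    return None
```

In the notebook this would be called as `fetch_with_retry(requests.get, url)`, and a `None` result could be handled like the missing page count, by adding the id to `books_to_remove`.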

In [56]:
temp_df = pd.DataFrame(descriptions, columns=['description'])
temp_df.to_csv('descriptions_complete.csv', index=False)

In [57]:
temp_df = pd.DataFrame(page_counts, columns=['page_count'])
temp_df.to_csv('page_count_complete.csv', index=False)

In [58]:
temp_df = pd.DataFrame(books_to_remove, columns=['goodreads_book_id'])
temp_df.to_csv('books_to_remove_complete.csv', index=False)

In [72]:
# Remove books we don't want (they are probably audio)
books_df = books_df[~books_df['goodreads_book_id'].isin(books_to_remove)]

In [74]:
# Add description and page_count columns
books_df['description'] = descriptions
books_df['page_count'] = page_counts
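
Assigning the two lists relies on them staying in the same order as the remaining rows. One alternative, sketched here on invented values, is to record each result next to its `goodreads_book_id` and merge on that key. The `scraped` list of tuples is hypothetical; the notebook itself keeps parallel lists.

```python
import pandas as pd

books = pd.DataFrame({"goodreads_book_id": [930, 1934, 6185],
                      "title": ["A", "B", "C"]})

# Hypothetical scrape results stored as (id, description, page_count)
scraped = [(6185, "desc C", 464), (930, "desc A", 434)]
scraped_df = pd.DataFrame(
    scraped, columns=["goodreads_book_id", "description", "page_count"]
)

# An inner merge keeps only the books that were scraped successfully,
# and the order in which results were collected no longer matters.
merged = books.merge(scraped_df, on="goodreads_book_id", how="inner")
print(merged)
```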

In [75]:
books_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 2647 entries, 32 to 9996
Data columns (total 17 columns):
book_id                      2647 non-null int64
goodreads_book_id            2647 non-null int64
isbn                         2647 non-null object
original_publication_year    2647 non-null float64
title                        2647 non-null object
average_rating               2647 non-null float64
ratings_count                2647 non-null int64
ratings_1                    2647 non-null int64
ratings_2                    2647 non-null int64
ratings_3                    2647 non-null int64
ratings_4                    2647 non-null int64
ratings_5                    2647 non-null int64
image_url                    2647 non-null object
small_image_url              2647 non-null object
author                       2647 non-null object
description                  2647 non-null object
page_count                   2647 non-null int64
dtypes: float64(2), int64(9), object(6)
memory us

In [81]:
books_df

Unnamed: 0,book_id,goodreads_book_id,isbn,original_publication_year,title,average_rating,ratings_count,ratings_1,ratings_2,ratings_3,ratings_4,ratings_5,image_url,small_image_url,author,description,page_count
32,33,930,739326228,1997.0,Memoirs of a Geisha,4.08,1300209,23500,59033,258700,517157,559782,https://s.gr-assets.com/assets/nophoto/book/11...,https://s.gr-assets.com/assets/nophoto/book/50...,Arthur Golden,"A literary sensation and runaway bestseller, t...",434
41,42,1934,451529308,1868.0,"Little Women (Little Women, #1)",4.04,1257121,31645,70011,250794,426280,535563,https://s.gr-assets.com/assets/nophoto/book/11...,https://s.gr-assets.com/assets/nophoto/book/50...,Louisa May Alcott,"Generations of readers young and old, male and...",449
62,63,6185,393978893,1847.0,Wuthering Heights,3.82,899195,46469,84084,215320,309180,346082,https://s.gr-assets.com/assets/nophoto/book/11...,https://s.gr-assets.com/assets/nophoto/book/50...,Emily Brontë,This best-selling Norton Critical Edition is b...,464
77,78,5139,307275558,2003.0,"The Devil Wears Prada (The Devil Wears Prada, #1)",3.70,665930,24231,58323,192366,226675,178250,https://s.gr-assets.com/assets/nophoto/book/11...,https://s.gr-assets.com/assets/nophoto/book/50...,Lauren Weisberger,A delightfully dishy novel about the all-time ...,432
85,86,32542,385338600,1989.0,A Time to Kill,4.03,597775,12106,25938,122675,218617,229488,https://s.gr-assets.com/assets/nophoto/book/11...,https://s.gr-assets.com/assets/nophoto/book/50...,John Grisham,Before The Firm and The Pelican Brief made him...,515
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9984,9985,19688,425176932,2000.0,"Breaking Point (Tom Clancy's Net Force, #4)",3.69,7693,268,684,2349,2456,2068,https://s.gr-assets.com/assets/nophoto/book/11...,https://s.gr-assets.com/assets/nophoto/book/50...,Steve Perry,"In the year 2000, computers are the new superp...",368
9986,9987,8087038,312651198,2010.0,"Chasing The Night (Eve Duncan, #11; Catherine ...",4.12,10129,113,331,2127,3957,4436,https://s.gr-assets.com/assets/nophoto/book/11...,https://s.gr-assets.com/assets/nophoto/book/50...,Iris Johansen,A CIA agent's two-year-old child was stolen in...,362
9987,9988,129237,674017722,1971.0,A Theory of Justice,3.91,8472,234,607,2001,3171,3095,https://s.gr-assets.com/assets/nophoto/book/11...,https://s.gr-assets.com/assets/nophoto/book/50...,John Rawls,"Since it appeared in 1971, John Rawls's A Theo...",824
9994,9995,15613,1416523723,1924.0,"Billy Budd, Sailor",3.09,10866,1478,2225,3805,2985,1617,https://s.gr-assets.com/assets/nophoto/book/11...,https://s.gr-assets.com/assets/nophoto/book/50...,Herman Melville,A handsome young sailor is unjustly accused of...,160


## Part 2: Cleaning Tags dataset
___
1. Redo `book_tags_cleaned.csv` so that it is keyed by ISBN instead of book_id. Recall that `book_tags.csv` has the columns (goodreads_book_id, tag_id, count).
2. Get the mapping from book_id to isbn from `books_cleaned.csv`.
3. Use that mapping to convert `book_tags_cleaned.csv` from (book_id, tag_id) to (isbn, tag_id).
4. Books that were not in `books_cleaned.csv` have no tags yet, so we must add tags for them.
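
The plan above can be sketched on toy frames. All values are invented, and `new_books` stands in for the `books_df` built in Part 1.

```python
import pandas as pd

books_cleaned = pd.DataFrame({"book_id": [1, 2], "isbn": ["111", "222"]})
book_tags = pd.DataFrame({"book_id": [1, 1, 2, 3], "tag_id": [5, 6, 5, 7]})
new_books = pd.DataFrame({"isbn": ["222", "333"]})

# Steps 2-3: map book_id to isbn, drop rows with no known isbn, drop book_id
book_id_to_isbn = dict(zip(books_cleaned["book_id"], books_cleaned["isbn"]))
book_tags = book_tags.assign(isbn=book_tags["book_id"].map(book_id_to_isbn))
book_tags = book_tags.dropna(subset=["isbn"]).drop(columns=["book_id"])

# Step 4: ISBNs that still have no tags and need them added
missing = set(new_books["isbn"]) - set(book_tags["isbn"])
print(missing)
```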

In [84]:
book_tags_cleaned_df = pd.read_csv(os.path.join(dataset_path, "book_tags_cleaned.csv"))
books_cleaned_df = pd.read_csv(os.path.join(dataset_path, "books_cleaned.csv"))

In [133]:
# Mapping from book_id to isbn
book_id_to_isbn = dict()
for _, row in books_cleaned_df[['book_id', 'isbn']].iterrows():
    book_id_to_isbn[row['book_id']] = row['isbn']

In [115]:
# Drop rows where book_id has no isbn
book_tags_cleaned_df = book_tags_cleaned_df[book_tags_cleaned_df['book_id'].isin(book_id_to_isbn.keys())]

In [117]:
# Add isbn column
book_tags_cleaned_df['isbn'] = book_tags_cleaned_df['book_id'].apply(lambda x: book_id_to_isbn[x])

In [121]:
# Drop book_id column
book_tags_cleaned_df.drop(columns=['book_id'], inplace=True)

In [130]:
# Get list of ISBN for books that do not have their tags in book_tags_cleaned
books_with_tags_in_book_tags = list(book_tags_cleaned_df['isbn'].unique())
books_with_tags_in_books = list(books_df['isbn'])

In [132]:
# ISBN of books in books but not in book_tags
isbn_of_books_with_missing_tags = set(books_with_tags_in_books) - set(books_with_tags_in_book_tags)

In [140]:
# Get goodreads book id of books with missing tags
goodreads_id_with_missing_tags = list(books_df[books_df['isbn'].isin(isbn_of_books_with_missing_tags)]['goodreads_book_id'])

In [164]:
# Get old tags of books with missing tags (in book_tags_cleaned) from book_tags.csv
goodreads_id_to_tags = book_tags_df[book_tags_df['goodreads_book_id'].isin(goodreads_id_with_missing_tags)].groupby('goodreads_book_id')['tag_id'].apply(list).to_dict()


In [147]:
# Convert old tags to new tags
tags_cleaned_df = pd.read_csv(os.path.join(dataset_path, "tags_cleaned.csv"))

# Mapping from tag name to id (in cleaned tags)
tag_name_to_id = dict()
for _, row in tags_cleaned_df[['tag_name', 'tag_id']].iterrows():
    tag_name_to_id[row['tag_name']] = row['tag_id']

In [210]:
# Count tag ids across the books with missing tags and keep those with more than 10 occurrences
from collections import defaultdict
all_tags = defaultdict(int)
for goodread_id, tags in goodreads_id_to_tags.items():
    for tag in tags:
        all_tags[tag] += 1

all_tags = [key for key, val in all_tags.items() if val > 10]
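
The same count-and-threshold step can be written with `collections.Counter`. This is a small sketch on invented data, with a threshold of 1 instead of the notebook's 10 so that the toy example has something to filter.

```python
from collections import Counter

# Invented example: goodreads id -> list of old tag ids
goodreads_id_to_tags = {10: [1, 2, 3], 11: [1, 2], 12: [1]}
threshold = 1  # the notebook uses 10

# Count how many books carry each tag id, then keep the frequent ones
counts = Counter(tag for tags in goodreads_id_to_tags.values() for tag in tags)
frequent_tags = [tag for tag, n in counts.items() if n > threshold]
print(frequent_tags)
```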


In [218]:
# Map each old tag id to a list of new (cleaned) tag ids
old_tag_id_to_new = dict()

for tag_id in all_tags:
    tag_name = tags_df[tags_df['tag_id'] == tag_id]['tag_name'].values[0]
    # Rename tag_name
    print('Old tag name:', tag_name)

    # If tag_name is in tag_name_to_id, we add (tag_id, tag_name_to_id[tag_name]) to a dict
    while True:
        print('input: ', end='')
        user_input = input()
        print()
        if user_input == '':
            break
        candidates = user_input.split(',')
        if set(candidates).intersection(tag_name_to_id.keys()) == set(candidates):
            print('you entered:', user_input)
            old_tag_id_to_new[tag_id] = [tag_name_to_id[c] for c in candidates]
            break
        print('not valid. Retry ', end=' ')




Old tag name:to-read
input:
Old tag name:favorites
input:
Old tag name:fantasy
input:
you entered:fantasy
Old tag name:currently-reading
input:
Old tag name:young-adult
input:
you entered:young-adult
Old tag name:fiction
input:
you entered:fiction
Old tag name:magic
input:
you entered:magic
Old tag name:owned
input:
Old tag name:series
input:
you entered:series
Old tag name:ya
input:
you entered:young-adult
Old tag name:books-i-own
input:
Old tag name:favourites
input:
Old tag name:adventure
input:
you entered:adventure
Old tag name:children
input:
you entered:children
Old tag name:children-s-books
input:
you entered:children
Old tag name:childrens
input:
you entered:children
Old tag name:childhood
input:
you entered:children
Old tag name:shelfari-favorites
input:
Old tag name:sci-fi-fantasy
input:
you entered:science-fiction,fantasy
Old tag name:classics
input:
you entered:classic
Old tag name:favorite-books
input:
Old tag name:novels
input:
you entered:novel
Old tag name:to-buy
input
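
Because the mapping is typed in by hand, re-running the cell means answering every prompt again. One option is to store the choices as a plain dictionary and validate them against the cleaned tag names. The renames below are a few of those visible in the session output above; the ids in `tag_name_to_id` and the `resolve` helper are hypothetical.

```python
# Renames taken from the recorded session; an empty list means the tag was skipped
old_name_to_new_names = {
    "to-read": [],
    "ya": ["young-adult"],
    "childrens": ["children"],
    "sci-fi-fantasy": ["science-fiction", "fantasy"],
    "classics": ["classic"],
    "novels": ["novel"],
}

# Hypothetical ids standing in for the contents of tags_cleaned.csv
tag_name_to_id = {"young-adult": 1, "children": 2, "science-fiction": 3,
                  "fantasy": 4, "classic": 5, "novel": 6}

def resolve(old_name):
    """Return the new tag ids for an old tag name, validating every new name."""
    new_names = old_name_to_new_names.get(old_name, [])
    unknown = [name for name in new_names if name not in tag_name_to_id]
    if unknown:
        raise KeyError(f"not in cleaned tags: {unknown}")
    return [tag_name_to_id[name] for name in new_names]

print(resolve("sci-fi-fantasy"))
```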

In [233]:

# Mapping from goodreads_book_id to isbn
goodreads_book_id_to_isbn = dict()
for _, row in books_df[['goodreads_book_id', 'isbn']].iterrows():
    goodreads_book_id_to_isbn[row['goodreads_book_id']] = row['isbn']

# Get new tag ids for books missing tags
isbn_id_to_tags_new = dict()
for goodreads_book_id, tags in goodreads_id_to_tags.items():
    # Flatten the new tag ids of every mapped old tag and deduplicate them
    new_tag_ids_for_book = {new_tag for tag in tags if tag in old_tag_id_to_new for new_tag in old_tag_id_to_new[tag]}
    isbn_id_to_tags_new[goodreads_book_id_to_isbn[goodreads_book_id]] = list(new_tag_ids_for_book)

In [240]:
new_tag_ids = []
new_isbn = []
for k, values in isbn_id_to_tags_new.items():
    for v in values:
        new_isbn.append(k)
        new_tag_ids.append(v)


In [241]:
additional_book_tags_df = pd.DataFrame(list(zip(new_tag_ids, new_isbn)), columns=['tag_id', 'isbn']) 
additional_book_tags_df

Unnamed: 0,tag_id,isbn
0,32,439682584
1,33,439682584
2,3,439682584
3,5,439682584
4,6,439682584
...,...,...
886,51,62415832
887,21,62415832
888,24,62415832
889,26,62415832
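
The two cells that turn the dictionary into (tag_id, isbn) rows can also be written as a single comprehension. A minimal sketch with invented ISBNs and tag ids:

```python
import pandas as pd

# Invented mapping: isbn -> list of new tag ids
isbn_to_tags = {"111": [3, 5], "222": [6], "333": []}

# One (tag_id, isbn) row per tag; books with an empty list contribute no rows
rows = [(tag_id, isbn) for isbn, tag_ids in isbn_to_tags.items() for tag_id in tag_ids]
long_df = pd.DataFrame(rows, columns=["tag_id", "isbn"])
print(long_df)
```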


In [242]:
book_tags_cleaned_df

Unnamed: 0,tag_id,isbn
0,0,439785960
1,9,439785960
2,1,439785960
3,13,439785960
4,3,439785960
...,...,...
210087,14,62498533
210088,5,62498533
210089,10,62498533
210090,65,62498533


In [244]:
final_book_tags_cleaned_df = pd.concat([book_tags_cleaned_df, additional_book_tags_df])

In [245]:
final_book_tags_cleaned_df.to_csv('final_book_tags.csv', index=False)

In [249]:
books_df.drop(columns=['book_id', 'goodreads_book_id', 'small_image_url']).to_csv('final_books.csv', index=False)

In [252]:
tags_cleaned_df.to_csv('final_tags.csv', index=False)

In [253]:
# Sanity check
final_books = pd.read_csv('final_books.csv')
final_book_tags = pd.read_csv('final_book_tags.csv')
final_tags = pd.read_csv('final_tags.csv')

In [266]:
# One last thing: let's get rid of tags like to-read, favourites and owned.
tag_id_to_remove = [0,1,2]  # to-read, favourites, owned
final_book_tags = final_book_tags[~final_book_tags['tag_id'].isin(tag_id_to_remove)]
final_tags = final_tags[~final_tags['tag_id'].isin(tag_id_to_remove)]

final_book_tags.to_csv('final_book_tags.csv', index=False)
final_tags.to_csv('final_tags.csv', index=False)

In [267]:
# Sanity check (again)
final_books = pd.read_csv('final_books.csv')
final_book_tags = pd.read_csv('final_book_tags.csv')
final_tags = pd.read_csv('final_tags.csv')
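
Reloading the files only confirms that they parse. A few explicit checks could make the sanity check more informative; the sketch below uses toy frames and a hypothetical `check_integrity` helper, and which checks should pass depends on how the three final files are meant to relate.

```python
import pandas as pd

books = pd.DataFrame({"isbn": ["111", "222"]})
tags = pd.DataFrame({"tag_name": ["fiction", "novel"], "tag_id": [3, 5]})
book_tags = pd.DataFrame({"tag_id": [3, 5, 3], "isbn": ["111", "111", "222"]})

def check_integrity(books, tags, book_tags, removed_tag_ids=(0, 1, 2)):
    """Hypothetical checks; returns a dict of named boolean results."""
    removed = list(removed_tag_ids)
    return {
        # every tagged isbn appears in the books table
        "isbns_known": bool(book_tags["isbn"].isin(books["isbn"]).all()),
        # every tag id used appears in the tags table
        "tag_ids_known": bool(book_tags["tag_id"].isin(tags["tag_id"]).all()),
        # the removed tag ids are absent
        "removed_tags_gone": not bool(book_tags["tag_id"].isin(removed).any()),
        # no (tag_id, isbn) pair is listed twice
        "no_duplicate_pairs": not bool(book_tags.duplicated().any()),
    }

results = check_integrity(books, tags, book_tags)
print(results)
```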

In [268]:
final_books

Unnamed: 0,isbn,original_publication_year,title,average_rating,ratings_count,ratings_1,ratings_2,ratings_3,ratings_4,ratings_5,image_url,author,description,page_count
0,739326228,1997.0,Memoirs of a Geisha,4.08,1300209,23500,59033,258700,517157,559782,https://s.gr-assets.com/assets/nophoto/book/11...,Arthur Golden,"A literary sensation and runaway bestseller, t...",434
1,451529308,1868.0,"Little Women (Little Women, #1)",4.04,1257121,31645,70011,250794,426280,535563,https://s.gr-assets.com/assets/nophoto/book/11...,Louisa May Alcott,"Generations of readers young and old, male and...",449
2,393978893,1847.0,Wuthering Heights,3.82,899195,46469,84084,215320,309180,346082,https://s.gr-assets.com/assets/nophoto/book/11...,Emily Brontë,This best-selling Norton Critical Edition is b...,464
3,307275558,2003.0,"The Devil Wears Prada (The Devil Wears Prada, #1)",3.70,665930,24231,58323,192366,226675,178250,https://s.gr-assets.com/assets/nophoto/book/11...,Lauren Weisberger,A delightfully dishy novel about the all-time ...,432
4,385338600,1989.0,A Time to Kill,4.03,597775,12106,25938,122675,218617,229488,https://s.gr-assets.com/assets/nophoto/book/11...,John Grisham,Before The Firm and The Pelican Brief made him...,515
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2642,425176932,2000.0,"Breaking Point (Tom Clancy's Net Force, #4)",3.69,7693,268,684,2349,2456,2068,https://s.gr-assets.com/assets/nophoto/book/11...,Steve Perry,"In the year 2000, computers are the new superp...",368
2643,312651198,2010.0,"Chasing The Night (Eve Duncan, #11; Catherine ...",4.12,10129,113,331,2127,3957,4436,https://s.gr-assets.com/assets/nophoto/book/11...,Iris Johansen,A CIA agent's two-year-old child was stolen in...,362
2644,674017722,1971.0,A Theory of Justice,3.91,8472,234,607,2001,3171,3095,https://s.gr-assets.com/assets/nophoto/book/11...,John Rawls,"Since it appeared in 1971, John Rawls's A Theo...",824
2645,1416523723,1924.0,"Billy Budd, Sailor",3.09,10866,1478,2225,3805,2985,1617,https://s.gr-assets.com/assets/nophoto/book/11...,Herman Melville,A handsome young sailor is unjustly accused of...,160


In [269]:
final_tags

Unnamed: 0,tag_name,tag_id
0,fiction,3
1,adult,4
2,novel,5
3,contemporary,6
4,series,7
...,...,...
137,aliens,140
138,dragons,141
139,police,142
140,smut,143


In [270]:
final_book_tags

Unnamed: 0,tag_id,isbn
0,9,439785960
1,13,439785960
2,3,439785960
3,7,439785960
4,32,439785960
...,...,...
161572,51,62415832
161573,21,62415832
161574,24,62415832
161575,26,62415832
