# Web Scraping Project - www.Hanna Andersson.com

I have chosen to practise my scraping skills on a US website that mainly sells pajamas: 
https://www.hannaandersson.com/

## Technical Requirements

The technical requirements for this project are as follows:

** You must clean and normalize your database.
* You must have at least 200 rows and 8 columns 9in the final clean database. More data is always welcome.


## Necessary Deliverables

The following deliverables should be pushed to your **Github repo** for this chapter.
* The result should be stored in **CSV format and SQL format. 
* A **Jupyter Notebook (.ipynb) file** that contains the code used to get the data. 
* An **output folder** containing the outputs of your API and scraping efforts.
* A **`README.md` file** containing a detailed explanation of your approach and code for retrieving data from the API and scraping the web page as well as your results, obstacles encountered, and lessons learned.

## Presentation

You will have **7 minutes** to present your project to the class and then **3 minutes** for Q&A,
so keep it simple!

The slides of your presentation must include the content listed below:
- Title of the project + Student name
- Description of your idea and project
- Challenges
- Process
- Learnings
- If I were to start from scratch...
- Improvements
- Highlights


## Suggested Ways to Get Started

* **Define a problem** - think what exactly you are willing to study. Prices on Black Friday? Biggest discounts?  Select your topic based on your points of interest and search for websites that contain some useful information.
* **Commit early, commit often**, don’t be afraid of doing something incorrectly because you can always roll back to a previous version.
* **Consult documentation and resources provided** to better understand the tools you are using and how to accomplish what you want.


## Useful Resources

* [Requests Library Documentation: Quickstart](http://docs.python-requests.org/en/master/user/quickstart/)
* [Requests library](http://docs.python-requests.org/en/master/#the-user-guide)
* [BeautifulSoup Documentation](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)
* [Stack Overflow Python Requests Questions](https://stackoverflow.com/questions/tagged/python-requests)
* [StackOverflow BeautifulSoup Questions](https://stackoverflow.com/questions/tagged/beautifulsoup)
* [Urllib](https://docs.python.org/3/library/urllib.html#module-urllib)
* [Public APIs](https://github.com/toddmotto/public-apis)
* [API List](https://apilist.fun/)
* [GOOGLE!!!](https://www.google/com)
- [lxml lib](https://lxml.de/)
- [Scrapy](https://scrapy.org/)
- [List of HTTP status codes](https://en.wikipedia.org/wiki/List_of_HTTP_status_codes)
- [HTML basics](http://www.simplehtmlguide.com/cheatsheet.php)
- [CSS basics](https://www.cssbasics.com/#page_start)



#### Below are the libraries and modules you may need. `requests`,  `BeautifulSoup` and `pandas` are already imported for you. If you prefer to use additional libraries feel free to do it.

In [1]:
import requests as r
from bs4 import BeautifulSoup
import pandas as pd

#### Download, parse (using BeautifulSoup), and print the content from the sale page of website:

In [2]:
# These are the urls I have scraped in this project
lst_urls=['https://www.hannaandersson.com/sale/?start=12&sz=12&format=page-element',
'https://www.hannaandersson.com/sale/?start=24&sz=12&format=page-element',
'https://www.hannaandersson.com/sale/?start=48&sz=12&format=page-element',
'https://www.hannaandersson.com/sale/?start=60&sz=12&format=page-element',
'https://www.hannaandersson.com/sale/?start=72&sz=12&format=page-element',
'https://www.hannaandersson.com/sale/?start=84&sz=12&format=page-element',
'https://www.hannaandersson.com/sale/?start=96&sz=12&format=page-element',
'https://www.hannaandersson.com/sale/?start=108&sz=12&format=page-element',
'https://www.hannaandersson.com/sale/?start=120&sz=12&format=page-element',
'https://www.hannaandersson.com/sale/?start=132&sz=12&format=page-element',
'https://www.hannaandersson.com/sale/?start=144&sz=12&format=page-element',
'https://www.hannaandersson.com/sale/?start=156&sz=12&format=page-element',
'https://www.hannaandersson.com/sale/?start=168&sz=12&format=page-element',
'https://www.hannaandersson.com/sale/?start=180&sz=12&format=page-element',
'https://www.hannaandersson.com/sale/?start=192&sz=12&format=page-element',
'https://www.hannaandersson.com/sale/?start=204&sz=12&format=page-element',
'https://www.hannaandersson.com/sale/?start=216&sz=12&format=page-element',
'https://www.hannaandersson.com/sale/?start=216&sz=12&format=page-element']

In [3]:
response=[r.get(url) for url in lst_urls]
response

[<Response [403]>,
 <Response [403]>,
 <Response [403]>,
 <Response [403]>,
 <Response [403]>,
 <Response [403]>,
 <Response [403]>,
 <Response [403]>,
 <Response [403]>,
 <Response [403]>,
 <Response [403]>,
 <Response [403]>,
 <Response [403]>,
 <Response [403]>,
 <Response [403]>,
 <Response [403]>,
 <Response [403]>,
 <Response [403]>]

In [4]:
headers="""accept: text/html, */*; q=0.01
accept-encoding: gzip, deflate, br
accept-language: fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7
cache-control: no-cache
cookie: __cfduid=db177e99449198c2582f571806cacecd51606386575; dwanonymous_e4fdf894e6616217dca137d1f8a3f000=bca7iDVLBk7UZL2wB8TbaSRzw6; RfkEnabled=false; __cq_dnt=0; cqcid=bca7iDVLBk7UZL2wB8TbaSRzw6; cquid=||; dw_dnt=0; notice_behavior=expressed,eu; _gcl_au=1.1.1332759810.1606386580; FPC=bd5af2f0-c6de-4e1d-90c30c1a5eb93f44; variantCookie=1; variantCookieTestID=back2criteo100; _ga=GA1.2.198259505.1606386580; _gid=GA1.2.953923278.1606386580; dw=1; dw_cookies_accepted=1; haNewVisitor=here; _fbp=fb.1.1606386581818.462643727; _pin_unauth=dWlkPU4yVTBNalE1TURZdFl6RTJOQzAwTkRCakxXSTNOVEF0WkRrNFpEWTVOamRoT1dVMQ; scarab.visitor=%222F6EB931F62030DA%22; __cq_uuid=bca7iDVLBk7UZL2wB8TbaSRzw6; IR_gbd=hannaandersson.com; __ruid=40293435-86-s5-49-1p-8bxxp3ofr62357vagmib-1606386582496; __rcmp=0!bj1ydzEsZj1ydyxzPTEsYz0yNDQwLHQ9MjAyMDA0MDguMTk1OTtuPXNiMSxmPXNiLHM9MSxjPTI0MzcsdD0yMDIwMDQwOC4yMDUw; bfx.apiKey=0fac4c60-6e15-11ea-ae9f-6965eb1b85ea; bfx.env=PROD; bfx.logLevel=ERROR; extole_access_token=TVPSF1BPU88L99B674HRTNKSPL; bfx.currency=EUR; bfx.language=en; bfx.isInternational=true; bfx.lcpRuleId=; notice_preferences=2:; notice_gdpr_prefs=0,1,2:; cmapi_gtm_bl=; cmapi_cookie_privacy=permit 1,2,3; __olapicU=1606408027741; SIZEBAY_SESSION_ID_V3=1625582F56D36f3d231bb78e4378abca47af504a3dd4; scarab.profile=%2262634%252DGL7%7C1606408223%22; styliticsWidgetSession=92d5534e-7690-4359-9d9c-c736b12d680d; styliticsWidgetData={%22cohortType%22:%22test%22%2C%22visitor_id%22:2902676716}; bfx.sessionId=bb5e3627-f82a-48f6-8bba-0c909b23cd2b; bfx.country=FR; cbt-consent-banner=CROSS-BORDER%20Consent%20Banner; bfx.isWelcomed=true; bfx.currencyQuoteId=71703387; __rfkp=; scarab.mayAdd=%5B%7B%22i%22%3A%2262364-SW5%22%7D%2C%7B%22i%22%3A%2257421-M23%22%7D%2C%7B%22i%22%3A%2265015-011%22%7D%2C%7B%22i%22%3A%2262317-ST0%22%7D%2C%7B%22i%22%3A%2257435-GM3%22%7D%2C%7B%22i%22%3A%2262341-PF8%22%7D%2C%7B%22i%22%3A%2262251-GL7%22%7D%2C%7B%22i%22%3A%2262634-GL7%22%7D%2C%7B%22i%22%3A%2265291-TE5%22%7D%2C%7B%22i%22%3A%2262627-TD6%22%7D%5D; __cq_bc=%7B%22bblm-hannaandersson%22%3A%5B%7B%22id%22%3A%2262634%22%2C%22type%22%3A%22vgroup%22%2C%22alt_id%22%3A%2262634-GL7%22%7D%2C%7B%22id%22%3A%2257435%22%2C%22type%22%3A%22vgroup%22%2C%22alt_id%22%3A%2257435-GM3%22%7D%2C%7B%22id%22%3A%2262627%22%2C%22type%22%3A%22vgroup%22%2C%22alt_id%22%3A%2262627-TD6%22%7D%2C%7B%22id%22%3A%2265291%22%2C%22type%22%3A%22vgroup%22%2C%22alt_id%22%3A%2265291-TE5%22%7D%2C%7B%22id%22%3A%2262251%22%2C%22type%22%3A%22vgroup%22%2C%22alt_id%22%3A%2262251-GL7%22%7D%2C%7B%22id%22%3A%2262341%22%2C%22type%22%3A%22vgroup%22%2C%22alt_id%22%3A%2262341-PF8%22%7D%2C%7B%22id%22%3A%2262317%22%2C%22type%22%3A%22vgroup%22%2C%22alt_id%22%3A%2262317-ST0%22%7D%2C%7B%22id%22%3A%2265015%22%2C%22type%22%3A%22vgroup%22%2C%22alt_id%22%3A%2265015-011%22%7D%2C%7B%22id%22%3A%2257421%22%2C%22type%22%3A%22vgroup%22%2C%22alt_id%22%3A%2257421-M23%22%7D%2C%7B%22id%22%3A%2262364%22%2C%22type%22%3A%22vgroup%22%2C%22alt_id%22%3A%2262364-SW5%22%7D%5D%7D; __cq_seg=0~0.51!1~-0.07!2~-0.40!3~-0.30!4~-0.10!5~-0.20!6~-0.02!7~-0.42!8~-0.21!9~0.46!f0~31~22; dwac_c15d78007bc7c83b06823fd5e8=Iaz7MWH6orFLkJbD08h-YBo55qfZmzCU9os%3D|dw-only|||USD|false|US%2FPacific|true; sid=Iaz7MWH6orFLkJbD08h-YBo55qfZmzCU9os; dwsid=WF3qbMla8ZG46JuTxrb9E2PI9_pxO2O0BfPKKOSxDMQxOs_Smw7ws-0sLgRNNbADVMj5wlfruyCAAgjpPAmL7w==; _fphu=%7B%22value%22%3A%225.414hvI3ylwvTD3HIKgv.1606386583%22%2C%22ts%22%3A1606516086236%7D; IR_PI=4b772917-2fd2-11eb-8667-0a35d197d7d2%7C1606605791433; __rutmb=40293435; ABTasty=uid=xfk1nev9k76xats5&fst=1606386578875&pst=1606516072195&cst=1606519388405&ns=14&pvt=83&pvis=83&th=552141.0.79.2.12.1.1606408052970.1606519394536.1_609645.754796.1.1.1.1.1606468885141.1606468885141.1_630789.782682.83.2.14.1.1606386579098.1606519394559.1_643924.799342.45.2.7.1.1606386578949.1606519394671.1_643925.799343.36.10.7.1.1606408221955.1606503658737.1_645356.801101.36.10.7.1.1606408221197.1606503658654.1_648502.0.36.2.7.1.1606386578962.1606519394692.1; IR_5644=1606519396304%7C417361%7C1606519391433%7C%7C; __rutma=40293435-86-s5-49-1p-8bxxp3ofr62357vagmib-1606386582496.1606516083780.1606519391786.19.61.2; fanplayr=%7B%22uuid%22%3A%221606386583528-702241ba9e3f3eb8df26e0e7%22%2C%22uk%22%3A%225.414hvI3ylwvTD3HIKgv.1606386583%22%2C%22sk%22%3A%222d488f9217c6e74ae69e37b1a9046e37%22%2C%22se%22%3A%22e1.fanplayr.com%22%2C%22tm%22%3A1%2C%22t%22%3A1606519397180%7D; __rpck=0!eyJwcm8iOiJkaXJlY3QiLCJidCI6eyIwIjpmYWxzZSwiMSI6bnVsbCwiMiI6NDk3OSwiMyI6MC4zM30sIkMiOnt9LCJOIjp7fSwiZHRzIjotNjU5LCJjc3AiOnsiYiI6MTI5NDU1LCJ0Ijo2NjkwLCJzcCI6MTU0ODA0LCJjIjo4fX0~; ABTastySession=mrasn=&lp=https://www.hannaandersson.com/sale/&sen=11; _gat_UA-6112906-3=1; __rpckx=0!eyJlYyI6NjUsInQ3Ijp7IjYxIjoxNjA2NTE5Mzk2NjgzfSwidDd2Ijp7IjYxIjoxNjA2NTE5NDU2NzQ2fSwiaXRpbWUiOiIyMDIwMTEyNy4yMzIzIn0~
pragma: no-cache
referer: https://www.hannaandersson.com/sale/
sec-fetch-dest: empty
sec-fetch-mode: cors
sec-fetch-site: same-origin
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"""

In [5]:
headers=dict([i.split(': ') for i in headers.split('\n')])
headers

{'accept': 'text/html, */*; q=0.01',
 'accept-encoding': 'gzip, deflate, br',
 'accept-language': 'fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7',
 'cache-control': 'no-cache',
 'cookie': '__cfduid=db177e99449198c2582f571806cacecd51606386575; dwanonymous_e4fdf894e6616217dca137d1f8a3f000=bca7iDVLBk7UZL2wB8TbaSRzw6; RfkEnabled=false; __cq_dnt=0; cqcid=bca7iDVLBk7UZL2wB8TbaSRzw6; cquid=||; dw_dnt=0; notice_behavior=expressed,eu; _gcl_au=1.1.1332759810.1606386580; FPC=bd5af2f0-c6de-4e1d-90c30c1a5eb93f44; variantCookie=1; variantCookieTestID=back2criteo100; _ga=GA1.2.198259505.1606386580; _gid=GA1.2.953923278.1606386580; dw=1; dw_cookies_accepted=1; haNewVisitor=here; _fbp=fb.1.1606386581818.462643727; _pin_unauth=dWlkPU4yVTBNalE1TURZdFl6RTJOQzAwTkRCakxXSTNOVEF0WkRrNFpEWTVOamRoT1dVMQ; scarab.visitor=%222F6EB931F62030DA%22; __cq_uuid=bca7iDVLBk7UZL2wB8TbaSRzw6; IR_gbd=hannaandersson.com; __ruid=40293435-86-s5-49-1p-8bxxp3ofr62357vagmib-1606386582496; __rcmp=0!bj1ydzEsZj1ydyxzPTEsYz0yNDQwLHQ9MjAyMDA0MD

In [6]:
responses=[r.get(url,headers=headers) for url in lst_urls]
responses

[<Response [200]>,
 <Response [200]>,
 <Response [200]>,
 <Response [200]>,
 <Response [200]>,
 <Response [200]>,
 <Response [200]>,
 <Response [200]>,
 <Response [200]>,
 <Response [200]>,
 <Response [200]>,
 <Response [200]>,
 <Response [200]>,
 <Response [200]>,
 <Response [200]>,
 <Response [200]>,
 <Response [200]>,
 <Response [200]>]

""" **Instructions:**

. Find out the html tag and class names used for the products in sale, using CSS Selector.
. Use BeautifulSoup to extract all the html elements that contain the product characteristics.
. Use string manipulation techniques to replace whitespaces and linebreaks (i.e. `\n`) in the *text* of each html element. Use a list to store the clean names.
. Print the list of products."""


In [7]:
resp_content=[response.content for response in responses]
type(resp_content)

list

In [8]:
soups= [BeautifulSoup(resp) for resp in resp_content]

In [9]:
type(soups)

list

In [10]:
productselect=[soup.select('.product__image .product__image--link') for soup in soups]

In [11]:
len(productselect)

18

In [12]:
productselect

[[<a aria-label="Image link for Trek Teal Baby Dress &amp; Bloomer Set In Organic Cotton 57421-PW1" class="product__image--link thumb-link" data-product-id="57421-PW1" href="https://www.hannaandersson.com/baby-girl-dresses-skirts/57421-PW1.html?dwvar_57421-PW1_color=PW1&amp;cgid=Sale" onclick="gtmAnalytics.submitProductImpressionClick(_analytics_43d738c382c04ff37d2473fc41, this, 'image');" title="Baby Dress &amp; Bloomer Set In Organic Cotton">
  <img alt="Product image for 57421-PW1" class="pt-image lazyload" data-src="https://www.hannaandersson.com/dw/image/v2/BBLM_PRD/on/demandware.static/-/Sites-master-catalog/default/dw02ce41b1/images/main/57421/57421_PW1_60_01.jpg?sw=369&amp;q=90" src="https://www.hannaandersson.com/dw/image/v2/BBLM_PRD/on/demandware.static/-/Sites-master-catalog/default/dw02ce41b1/images/main/57421/57421_PW1_60_01.jpg?sw=369&amp;q=50"/>
  <img alt="Alternate product image for 57421-PW1" class="alt-image lazyload" data-src="https://www.hannaandersson.com/dw/image

In [13]:
flatproduct = [val for sublist in productselect for val in sublist]
names=[i.get('title') for i in flatproduct]
names

['Baby Dress & Bloomer Set In Organic Cotton',
 'Rugged Rugby',
 'Athletic Jacket',
 'Stripe Jersey Dress',
 'Colorblock Tee',
 'Colorblock Tee',
 'Soft Art Tee',
 'Sunblock Rash Guard',
 'Corduroy Twirl Dress',
 'Baby Bodysuit In Organic Cotton',
 'Disney Princess Tutu In Soft Tulle',
 'Woven Canvas Pants',
 'Athletic Shorts',
 'Rugged Rugby',
 'Woven Canvas Pants',
 'Star Wars™ Glow-In-The-Dark Long John Pajamas In Organic Cotton',
 'Corduroy Twirl Dress',
 'Rugged Rugby',
 'Athletic Jacket',
 'Stripe Jersey Dress',
 'Rugged Rugby',
 'Colorblock Tee',
 'Baby Dress & Bloomer Set In Organic Cotton',
 'Soft Art Tee',
 'Snap Romper In Organic Cotton',
 'Bright Basics Tee In Organic Cotton',
 'Double Knee Woven Pants',
 'Baby Bodysuit In Organic Cotton',
 'Play Appy Tee',
 'Soft Art Tee',
 'Play Tee & Pant Set',
 'Bright Basics Tee In Organic Cotton',
 'Play Appy Tee',
 'Snap Romper In Organic Cotton',
 'Soft Art Tee',
 'Sunblock Rash Guard',
 'Short John Pajamas In Organic Cotton',
 'Bri

In [14]:
print(len(names))
names_vf=names[0:204]
print(len(names_vf))

216
204


In [15]:
imageselect=[soup.select('.pt-image') for soup in soups]
print(len(imageselect))
imageselect

18


[[<img alt="Product image for 57421-PW1" class="pt-image lazyload" data-src="https://www.hannaandersson.com/dw/image/v2/BBLM_PRD/on/demandware.static/-/Sites-master-catalog/default/dw02ce41b1/images/main/57421/57421_PW1_60_01.jpg?sw=369&amp;q=90" src="https://www.hannaandersson.com/dw/image/v2/BBLM_PRD/on/demandware.static/-/Sites-master-catalog/default/dw02ce41b1/images/main/57421/57421_PW1_60_01.jpg?sw=369&amp;q=50"/>,
  <img alt="Product image for 51897-RF6" class="pt-image lazyload" data-src="https://www.hannaandersson.com/dw/image/v2/BBLM_PRD/on/demandware.static/-/Sites-master-catalog/default/dwaa73272f/images/main/51897/51897_RF6_110_01.jpg?sw=369&amp;q=90" src="https://www.hannaandersson.com/dw/image/v2/BBLM_PRD/on/demandware.static/-/Sites-master-catalog/default/dwaa73272f/images/main/51897/51897_RF6_110_01.jpg?sw=369&amp;q=50"/>,
  <img alt="Product image for 64365-990" class="pt-image lazyload" data-src="https://www.hannaandersson.com/dw/image/v2/BBLM_PRD/on/demandware.stati

In [16]:
flatimage = [val for sublist in imageselect for val in sublist]
images=[i.get('data-src') for i in flatimage]
images

['https://www.hannaandersson.com/dw/image/v2/BBLM_PRD/on/demandware.static/-/Sites-master-catalog/default/dw02ce41b1/images/main/57421/57421_PW1_60_01.jpg?sw=369&q=90',
 'https://www.hannaandersson.com/dw/image/v2/BBLM_PRD/on/demandware.static/-/Sites-master-catalog/default/dwaa73272f/images/main/51897/51897_RF6_110_01.jpg?sw=369&q=90',
 'https://www.hannaandersson.com/dw/image/v2/BBLM_PRD/on/demandware.static/-/Sites-master-catalog/default/dw033ce99e/images/main/64365/64365_990_110_01.jpg?sw=369&q=90',
 'https://www.hannaandersson.com/dw/image/v2/BBLM_PRD/on/demandware.static/-/Sites-master-catalog/default/dw4d5a3570/images/main/62518/62518_SR4_110_01.jpg?sw=369&q=90',
 'https://www.hannaandersson.com/dw/image/v2/BBLM_PRD/on/demandware.static/-/Sites-master-catalog/default/dw10fe85c3/images/main/64367/64367_FQ8_110_01.jpg?sw=369&q=90',
 'https://www.hannaandersson.com/dw/image/v2/BBLM_PRD/on/demandware.static/-/Sites-master-catalog/default/dw432a0a5d/images/main/64367/64367_A96_110_01

In [17]:
print(len(images))
images_vf=images[0:204]
print(len(images_vf))

216
204


In [18]:
links=[i.get('href') for i in flatproduct]

In [19]:
print(len(links))
links_vf=links[0:204]
print(len(links_vf))

216
204


In [20]:
stdpriceselect=[soup.select('.pdo-product-price .bfx-original-price') for soup in soups]
print(len(stdpriceselect))
flatstdprice = [val for sublist in stdpriceselect for val in sublist]
flatstdprice

18


[<span class="bfx-original-price pdo-price-standard" tabindex="0"><span class="visually-hidden">Standard Price:</span>$40.00</span>,
 <span class="bfx-original-price pdo-price-standard" tabindex="0"><span class="visually-hidden">Standard Price:</span>$44.00</span>,
 <span class="bfx-original-price pdo-price-standard" tabindex="0"><span class="visually-hidden">Standard Price:</span>$54.00</span>,
 <span class="bfx-original-price pdo-price-standard" tabindex="0"><span class="visually-hidden">Standard Price:</span>$36.00</span>,
 <span class="bfx-original-price pdo-price-standard" tabindex="0"><span class="visually-hidden">Standard Price:</span>$36.00</span>,
 <span class="bfx-original-price pdo-price-standard" tabindex="0"><span class="visually-hidden">Standard Price:</span>$36.00</span>,
 <span class="bfx-original-price pdo-price-standard" tabindex="0"><span class="visually-hidden">Standard Price:</span>$34.00</span>,
 <span class="bfx-original-price pdo-price-standard" tabindex="0"><sp

In [21]:
standard_prices=[float(i.text.strip("Standard Price:$")) for i in flatstdprice]
standard_prices

[40.0,
 44.0,
 54.0,
 36.0,
 36.0,
 36.0,
 34.0,
 32.0,
 52.0,
 16.0,
 54.0,
 50.0,
 38.0,
 44.0,
 50.0,
 50.0,
 52.0,
 44.0,
 54.0,
 36.0,
 44.0,
 36.0,
 40.0,
 28.0,
 68.0,
 18.0,
 48.0,
 16.0,
 28.0,
 34.0,
 60.0,
 18.0,
 28.0,
 68.0,
 34.0,
 32.0,
 42.0,
 18.0,
 36.0,
 28.0,
 50.0,
 50.0,
 42.0,
 18.0,
 40.0,
 36.0,
 50.0,
 60.0,
 60.0,
 50.0,
 60.0,
 36.0,
 46.0,
 44.0,
 28.0,
 36.0,
 50.0,
 54.0,
 20.0,
 50.0,
 38.0,
 32.0,
 54.0,
 28.0,
 48.0,
 46.0,
 14.0,
 48.0,
 38.0,
 38.0,
 20.0,
 30.0,
 20.0,
 50.0,
 46.0,
 50.0,
 46.0,
 48.0,
 60.0,
 36.0,
 44.0,
 36.0,
 24.0,
 42.0,
 38.0,
 52.0,
 34.0,
 20.0,
 68.0,
 54.0,
 48.0,
 48.0,
 48.0,
 40.0,
 20.0,
 66.0,
 50.0,
 42.0,
 38.0,
 18.0,
 28.0,
 16.0,
 50.0,
 48.0,
 38.0,
 28.0,
 40.0,
 42.0,
 42.0,
 34.0,
 42.0,
 48.0,
 36.0,
 38.0,
 36.0,
 18.0,
 48.0,
 38.0,
 48.0,
 36.0,
 38.0,
 30.0,
 28.0,
 34.0,
 28.0,
 54.0,
 50.0,
 50.0,
 24.0,
 34.0,
 38.0,
 96.0,
 18.0,
 38.0,
 42.0,
 34.0,
 48.0,
 44.0,
 30.0,
 34.0,
 46.0,
 18.0,
 30.0,

In [22]:
type(standard_prices)

list

In [23]:
print(len(standard_prices))
standard_prices_vf=standard_prices[0:204]
print(len(standard_prices_vf))

207
204


In [24]:
salepriceselect=[soup.select('.pdo-product-price .bfx-price') for soup in soups]
print(len(salepriceselect))
flatsaleprice = [val for sublist in salepriceselect for val in sublist]
flatsaleprice

18


[<span class="bfx-price pdo-price-sales" tabindex="0"><span class="visually-hidden">Sale Price:</span>$15.99</span>,
 <span class="bfx-price pdo-price-sales" tabindex="0"><span class="visually-hidden">Sale Price:</span>$17.99</span>,
 <span class="bfx-price pdo-price-sales" tabindex="0"><span class="visually-hidden">Sale Price:</span>$21.99</span>,
 <span class="bfx-price pdo-price-sales" tabindex="0"><span class="visually-hidden">Sale Price:</span>$13.99</span>,
 <span class="bfx-price pdo-price-sales" tabindex="0"><span class="visually-hidden">Sale Price:</span>$13.99</span>,
 <span class="bfx-price pdo-price-sales" tabindex="0"><span class="visually-hidden">Sale Price:</span>$13.99</span>,
 <span class="bfx-price pdo-price-sales" tabindex="0"><span class="visually-hidden">Sale Price:</span>$16.99</span>,
 <span class="bfx-price pdo-price-sales" tabindex="0"><span class="visually-hidden">Sale Price:</span>$7.79</span>,
 <span class="bfx-price pdo-price-sales" tabindex="0"><span class

In [25]:
sale_prices=[float(i.text.strip('Sale Price:$')) for i in flatsaleprice]
sale_prices

[15.99,
 17.99,
 21.99,
 13.99,
 13.99,
 13.99,
 16.99,
 7.79,
 12.79,
 7.99,
 21.99,
 19.99,
 14.99,
 17.99,
 19.99,
 18.99,
 12.79,
 17.99,
 21.99,
 13.99,
 17.99,
 13.99,
 15.99,
 10.99,
 26.99,
 6.99,
 18.99,
 7.99,
 10.99,
 16.99,
 14.79,
 6.99,
 6.79,
 26.99,
 16.99,
 7.79,
 20.99,
 6.99,
 8.79,
 6.79,
 24.99,
 19.99,
 16.99,
 6.99,
 15.99,
 8.79,
 19.99,
 23.99,
 14.79,
 23.99,
 23.99,
 13.99,
 17.99,
 10.79,
 6.79,
 13.99,
 12.79,
 21.99,
 4.79,
 29.99,
 14.99,
 7.79,
 21.99,
 13.99,
 18.99,
 17.99,
 7.0,
 18.99,
 14.99,
 18.99,
 4.79,
 15.0,
 4.79,
 19.99,
 17.99,
 19.99,
 17.99,
 23.99,
 23.99,
 13.99,
 17.99,
 13.99,
 5.79,
 21.0,
 14.99,
 20.99,
 16.99,
 10.0,
 33.99,
 21.99,
 18.99,
 23.99,
 23.99,
 15.99,
 7.99,
 39.99,
 29.99,
 16.99,
 14.99,
 6.99,
 10.99,
 7.99,
 24.99,
 23.99,
 14.99,
 13.99,
 15.99,
 20.99,
 24.99,
 16.99,
 24.99,
 11.79,
 17.99,
 18.99,
 17.99,
 6.99,
 23.99,
 14.99,
 23.99,
 17.99,
 18.99,
 7.79,
 6.79,
 16.99,
 13.99,
 26.99,
 29.99,
 23.99,
 5.79

In [26]:
print(len(sale_prices))
sale_prices_vf=sale_prices[0:204]
print(len(sale_prices_vf))

207
204


In [27]:
largeratingselect=[soup.select('.product__ratings') for soup in soups]
len(largeratingselect)

18

In [28]:
for a in range(0,17):
    for j in range(0,12):
        print(largeratingselect[a][j])

<div class="product__ratings">
<div class="TTteaser TTteaser-tile" data-productid="57421-PW1" data-starrating="4.5"></div>
</div>
<div class="product__ratings">
<div class="TTteaser TTteaser-tile" data-productid="51897-RF6" data-starrating="4.5"></div>
</div>
<div class="product__ratings">
</div>
<div class="product__ratings">
</div>
<div class="product__ratings">
</div>
<div class="product__ratings">
</div>
<div class="product__ratings">
</div>
<div class="product__ratings">
</div>
<div class="product__ratings">
</div>
<div class="product__ratings">
</div>
<div class="product__ratings">
</div>
<div class="product__ratings">
</div>
<div class="product__ratings">
</div>
<div class="product__ratings">
<div class="TTteaser TTteaser-tile" data-productid="51897-QV2" data-starrating="4.5"></div>
</div>
<div class="product__ratings">
</div>
<div class="product__ratings">
<div class="TTteaser TTteaser-tile" data-productid="46123-JU4" data-starrating="4.5"></div>
</div>
<div class="product__rat

In [29]:
for a in range(0,17):
    for j in range(0,12):
        print(len(str(list(largeratingselect[a][j]))))

105
105
6
6
6
6
6
6
6
6
6
6
6
105
6
105
6
105
6
6
105
6
105
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
105
6
105
6
6
6
6
105
6
6
6
6
6
6
6
6
105
6
6
6
6
6
6
105
105
6
6
6
105
105
105
6
6
6
6
105
6
6
105
105
6
105
6
6
6
105
6
6
6
6
6
105
105
6
6
105
6
6
6
6
6
6
6
6
105
6
105
6
6
6
6
6
105
6
6
6
6
6
6
6
6
6
6
6
6
105
6
6
6
6
6
6
105
6
105
6
6
6
6
6
6
6
6
6
105
6
105
105
105
105
105
6
105
105
6
6
6
105
105
6
6
6
105
6
105
6
6
105
105
105
6
6
6
105
6
105
6
105
6
6
105
105
105
105
105
105
105
105
105
105
6
105
105
105
105
6
105
105
105
6
6
105


In [30]:
ratings=[]
for a in range(0,17):
    for j in range(0,12):
        if len(str(list(largeratingselect[a][j]))) <=8:
            x='NotRated'
        else:
            x=str(list(largeratingselect[a][j])[1]).split(' ')[4].strip('da"t-sring=</v>')
        ratings.append(x)
ratings

['4.5',
 '4.5',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 '4.5',
 'NotRated',
 '4.5',
 'NotRated',
 '4.5',
 'NotRated',
 'NotRated',
 '4.5',
 'NotRated',
 '4.5',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 '4.5',
 'NotRated',
 '5.0',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 '4.5',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 '5.0',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 '5.0',
 '5.0',
 'NotRated',
 'NotRated',
 'NotRated',
 '5.0',
 '5.0',
 '5.0',
 'NotRated',
 'NotRated',
 'NotRated',
 'NotRated',
 '4.5',
 'NotRated',
 'NotRated',
 '4.5',
 '4.5',
 'NotRated',
 '4.5',
 'NotRat

In [31]:
print(len(ratings))
ratings_vf=ratings[0:204]
print(len(ratings_vf))

204
204


In [32]:
categoryselect=[soup.select("div.product script[type]") for soup in soups]
len(categoryselect) 

18

In [33]:
import re

In [34]:
flattened6 = [val for sublist in categoryselect for val in sublist]
product_category=re.findall("\"dimension7\" : \"(.*?)\"", ' '.join([i.text for i in flattened6]))
product_category

['baby-girl-dresses-skirts',
 'boys-clothing-tops-shirts',
 'boys-clothing-jackets-coats',
 'girls-clothing-dresses',
 'boys-clothing-tops-shirts',
 'boys-clothing-tops-shirts',
 'girls-clothing-tops-shirts',
 'baby-girl-swimsuits',
 'girls-clothing-dresses',
 'baby-girl-tops-shirts',
 'girls-clothes-skirts-shorts',
 'boys-clothing-pants-shorts',
 'boys-clothing-shorts',
 'boys-clothing-tops-shirts',
 'boys-clothing-pants-shorts',
 'pajamas-kids',
 'girls-clothing-dresses',
 'boys-clothing-tops-shirts',
 'boys-clothing-jackets-coats',
 'girls-clothing-dresses',
 'boys-clothing-tops-shirts',
 'boys-clothing-tops-shirts',
 'baby-girl-dresses-skirts',
 'girls-clothing-tops-shirts',
 'baby-girl-rompers',
 'baby-girl-tops-shirts',
 'boys-clothing-pants-shorts',
 'baby-girl-tops-shirts',
 'baby-girl-tops-shirts',
 'girls-clothing-tops-shirts',
 'baby-girl-pants-leggings-shorts',
 'baby-girl-tops-shirts',
 'baby-girl-tops-shirts',
 'baby-girl-rompers',
 'girls-clothing-tops-shirts',
 'baby-gi

In [35]:
print(len(product_category))
product_category_vf=product_category[0:204]
print(len(product_category_vf))

216
204


In [36]:
product_color=[re.findall("\"variant\" : \"(.*?)\"", ' '.join([i.text for i in flattened6]))]
flatcolors = [val for sublist in product_color for val in sublist]
flatcolors

['Trek Teal',
 'Navy/Golden Hour',
 'Multi',
 'Navy Blue',
 'Hanna Red Multi',
 'Navy Multi',
 'Sunshine',
 'Map Blue',
 'Sunshine',
 'Lilla Rosa',
 'Belle',
 'Deep Olive',
 'Multi',
 'Navy/Fjord Green',
 'Carpenter Brown',
 'Chewbacca',
 'Positively Purple',
 'Navy/Hanna Red',
 'Navy Blue',
 'Pink Peony',
 'Navy/Foggy Blue',
 'Multi',
 'Sweet Lavender',
 'Hanna White',
 'Unicorns Hanna White',
 'Power Pink',
 'Expedition Green',
 'Heather Grey',
 'Foggy Blue',
 'Lookout Blue',
 'Ecru',
 'Sweet Lavender',
 'Soft Sage',
 'Bright Butterflies',
 'Navy Blue',
 'Navy',
 'Apple',
 'Hanna White Multi',
 'Multi',
 'Pink Peony',
 'Spider-Man',
 'Positively Purple',
 'Dalmation',
 'Hanna Red',
 'Fancy Frogs',
 'Hanna White Multi',
 'Lookout Blue',
 'Foggy Blue',
 'Multi',
 'Darth Vader Black',
 'Magic Bloom',
 'Navy Multi',
 'Navy Blue',
 'Sea Salt/ Bunny Brown',
 'Ecru',
 'Hanna Red',
 'Pink Peony',
 'Ariel',
 'Pumpkin',
 'Frozen 2',
 'Ariel',
 'White',
 'Rapunzel',
 'Navy Blue',
 'Iron Man',
 

In [37]:
print(len(flatcolors))
flatcolors_vf=flatcolors[:204]
print(len(flatcolors_vf))

216
204


In [101]:
dict={"Name":names_vf,
      "Color":flatcolors_vf,
      "Product_Category":product_category_vf,
      "Rating":ratings_vf,
      "Standard_Price_$":standard_prices_vf,
      "Sale_Price_$":sale_prices_vf,
      "Image":images_vf,
      "Product_Link":links_vf,}

In [102]:
df = pd.DataFrame(dict)
df

Unnamed: 0,Name,Color,Product_Category,Rating,Standard_Price_$,Sale_Price_$,Image,Product_Link
0,Baby Dress & Bloomer Set In Organic Cotton,Trek Teal,baby-girl-dresses-skirts,4.5,40.0,15.99,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/baby-girl-dress...
1,Rugged Rugby,Navy/Golden Hour,boys-clothing-tops-shirts,4.5,44.0,17.99,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/boys-clothing-t...
2,Athletic Jacket,Multi,boys-clothing-jackets-coats,NotRated,54.0,21.99,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/boys-clothing-j...
3,Stripe Jersey Dress,Navy Blue,girls-clothing-dresses,NotRated,36.0,13.99,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/girls-clothing-...
4,Colorblock Tee,Hanna Red Multi,boys-clothing-tops-shirts,NotRated,36.0,13.99,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/boys-clothing-t...
...,...,...,...,...,...,...,...,...
199,Baby Zip Sleeper In Organic Cotton,Purple Hills/Hanna White,pajamas-baby,5.0,42.0,20.99,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/pajamas-baby/33...
200,Long John Pajamas In Organic Cotton,The Pumpkin,pajamas-kids,4.5,42.0,16.99,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/pajamas-kids/51...
201,Long John Pajamas In Organic Cotton,Fall Animals,pajamas-kids,NotRated,32.0,7.99,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/pajamas-kids/64...
202,Halloween Rib Knit Dress,Magic Cats,girls-clothing-dresses,NotRated,42.0,16.99,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/girls-clothing-...


In [103]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 204 entries, 0 to 203
Data columns (total 8 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   Name              204 non-null    object 
 1   Color             204 non-null    object 
 2   Product_Category  204 non-null    object 
 3   Rating            204 non-null    object 
 4   Standard_Price_$  204 non-null    float64
 5   Sale_Price_$      204 non-null    float64
 6   Image             204 non-null    object 
 7   Product_Link      204 non-null    object 
dtypes: float64(2), object(6)
memory usage: 12.9+ KB


In [104]:
pd.to_numeric(df['Rating'], errors='coerce')

0      4.5
1      4.5
2      NaN
3      NaN
4      NaN
      ... 
199    5.0
200    4.5
201    NaN
202    NaN
203    4.5
Name: Rating, Length: 204, dtype: float64

In [106]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 204 entries, 0 to 203
Data columns (total 8 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   Name              204 non-null    object 
 1   Color             204 non-null    object 
 2   Product_Category  204 non-null    object 
 3   Rating            204 non-null    object 
 4   Standard_Price_$  204 non-null    float64
 5   Sale_Price_$      204 non-null    float64
 6   Image             204 non-null    object 
 7   Product_Link      204 non-null    object 
dtypes: float64(2), object(6)
memory usage: 12.9+ KB


In [107]:
df.duplicated()

0      False
1      False
2      False
3      False
4      False
       ...  
199    False
200    False
201    False
202    False
203    False
Length: 204, dtype: bool

In [108]:
df.isna().sum()

Name                0
Color               0
Product_Category    0
Rating              0
Standard_Price_$    0
Sale_Price_$        0
Image               0
Product_Link        0
dtype: int64

In [109]:
pajamas=df.sort_values(['Sale_Price_$','Standard_Price_$'], ascending=False, ignore_index=True)
pajamas

Unnamed: 0,Name,Color,Product_Category,Rating,Standard_Price_$,Sale_Price_$,Image,Product_Link
0,Women's Long John Set In Organic Cotton,Harvest Fairisle,pajamas-adult-women,NotRated,96.0,57.99,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/pajamas-adult-w...
1,Adult Short John Pajamas In Organic Cotton,Pink Peony Multi,pajamas-adult,NotRated,66.0,39.99,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/pajamas-adult/6...
2,Stretch Denim Overalls,Vintage Wash,girls-clothing-pants-leggings-shorts,NotRated,68.0,33.99,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/girls-clothing-...
3,Long John Pajamas In Organic Cotton,Unicorn Character,pajamas-kids,4.5,48.0,33.99,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/pajamas-kids/49...
4,Disney Frozen 2 Long John Pajamas,Frozen 2,pajamas-kids,NotRated,50.0,29.99,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/pajamas-kids/62...
...,...,...,...,...,...,...,...,...
199,Play Sock 3-Pack,Unicorn 3pk,sale-event-hs,NotRated,24.0,5.79,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/sale-event-hs/6...
200,Play Sock 3-Pack,Dino 3pk,sale-event-hs,NotRated,24.0,5.79,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/sale-event-hs/6...
201,Who Will You Be Cap,Pumpkin,accessories-baby-hats,5.0,20.0,4.79,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/accessories-bab...
202,Who Will You Be Cap,Night Flight,accessories-baby-hats,5.0,20.0,4.79,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/accessories-bab...


In [110]:
pajamas.insert(loc=6,column='Sale_%',value=0)
pajamas.head()

Unnamed: 0,Name,Color,Product_Category,Rating,Standard_Price_$,Sale_Price_$,Sale_%,Image,Product_Link
0,Women's Long John Set In Organic Cotton,Harvest Fairisle,pajamas-adult-women,NotRated,96.0,57.99,0,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/pajamas-adult-w...
1,Adult Short John Pajamas In Organic Cotton,Pink Peony Multi,pajamas-adult,NotRated,66.0,39.99,0,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/pajamas-adult/6...
2,Stretch Denim Overalls,Vintage Wash,girls-clothing-pants-leggings-shorts,NotRated,68.0,33.99,0,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/girls-clothing-...
3,Long John Pajamas In Organic Cotton,Unicorn Character,pajamas-kids,4.5,48.0,33.99,0,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/pajamas-kids/49...
4,Disney Frozen 2 Long John Pajamas,Frozen 2,pajamas-kids,NotRated,50.0,29.99,0,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/pajamas-kids/62...


In [111]:
pajamas['Sale_%']=pajamas['Sale_Price_$']*100/pajamas['Standard_Price_$']
pajamas.head()

Unnamed: 0,Name,Color,Product_Category,Rating,Standard_Price_$,Sale_Price_$,Sale_%,Image,Product_Link
0,Women's Long John Set In Organic Cotton,Harvest Fairisle,pajamas-adult-women,NotRated,96.0,57.99,60.40625,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/pajamas-adult-w...
1,Adult Short John Pajamas In Organic Cotton,Pink Peony Multi,pajamas-adult,NotRated,66.0,39.99,60.590909,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/pajamas-adult/6...
2,Stretch Denim Overalls,Vintage Wash,girls-clothing-pants-leggings-shorts,NotRated,68.0,33.99,49.985294,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/girls-clothing-...
3,Long John Pajamas In Organic Cotton,Unicorn Character,pajamas-kids,4.5,48.0,33.99,70.8125,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/pajamas-kids/49...
4,Disney Frozen 2 Long John Pajamas,Frozen 2,pajamas-kids,NotRated,50.0,29.99,59.98,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/pajamas-kids/62...


In [116]:
pajamasTwoDecimals = pajamas.round(decimals=2)
pajamasTwoDecimals.head()

Unnamed: 0,Name,Color,Product_Category,Rating,Standard_Price_$,Sale_Price_$,Sale_%,Image,Product_Link
0,Women's Long John Set In Organic Cotton,Harvest Fairisle,pajamas-adult-women,NotRated,96.0,57.99,60.41,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/pajamas-adult-w...
1,Adult Short John Pajamas In Organic Cotton,Pink Peony Multi,pajamas-adult,NotRated,66.0,39.99,60.59,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/pajamas-adult/6...
2,Stretch Denim Overalls,Vintage Wash,girls-clothing-pants-leggings-shorts,NotRated,68.0,33.99,49.99,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/girls-clothing-...
3,Long John Pajamas In Organic Cotton,Unicorn Character,pajamas-kids,4.5,48.0,33.99,70.81,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/pajamas-kids/49...
4,Disney Frozen 2 Long John Pajamas,Frozen 2,pajamas-kids,NotRated,50.0,29.99,59.98,https://www.hannaandersson.com/dw/image/v2/BBL...,https://www.hannaandersson.com/pajamas-kids/62...


In [121]:
pajamasTwoDecimals.Rating.mode()

0    NotRated
dtype: object

In [122]:
pajamasTwoDecimals.Rating.value_counts()*100/pajamasTwoDecimals.shape[0]

NotRated    67.647059
4.5         16.666667
5.0         14.705882
3.0          0.490196
1.0          0.490196
Name: Rating, dtype: float64

In [124]:
pajamasTwoDecimals.describe()

Unnamed: 0,Standard_Price_$,Sale_Price_$,Sale_%
count,204.0,204.0,204.0
mean,40.392157,17.656078,43.184902
std,11.907821,7.194817,9.812842
min,14.0,4.79,23.95
25%,34.0,13.99,39.25
50%,42.0,16.99,40.805
75%,48.0,23.24,49.98
max,96.0,57.99,70.81


In [125]:
pajamasTwoDecimals.median()

Standard_Price_$    42.000
Sale_Price_$        16.990
Sale_%              40.805
dtype: float64

In [126]:
data=pajamasTwoDecimals.to_csv("pajamasTwoDecimals.csv", index=False)