# Web Scraping Flipkart using Beautiful Soup

The details of mobile phones listed on Flipkart are scraped using Beautiful Soup.

Importing all necessary libraries.

In [1]:
from bs4 import BeautifulSoup
import requests
import csv
import pandas as pd


The URL of the search results page is stored in url and passed to requests.get(); the response is stored in r.

In [14]:
url = 'https://www.flipkart.com/search?q=smartphone&sid=tyy%2C4io&as=on&as-show=on&otracker=AS_QueryStore_OrganicAutoSuggest_1_7_na_na_na&otracker1=AS_QueryStore_OrganicAutoSuggest_1_7_na_na_na&as-pos=1&as-type=HISTORY&suggestionId=smartphone%7CMobiles&requestId=f932e878-8fb8-4289-8266-e1c8b556539e'
r = requests.get(url)
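
The long query string above can also be assembled from a dictionary, which makes the individual parameters easier to read. The following is a minimal sketch, not part of the original notebook: it keeps only the `q` and `sid` parameters from the URL, on the assumption that the tracking parameters are not needed, and it prepares the request offline without sending it.

```python
import requests

# Parameters taken from the URL above; the tracking parameters are omitted
# on the assumption that the search itself does not need them.
params = {"q": "smartphone", "sid": "tyy,4io"}

# Prepare (but do not send) the request to inspect the final URL.
prepared = requests.Request(
    "GET", "https://www.flipkart.com/search", params=params
).prepare()
prepared_url = prepared.url
print(prepared_url)
```

Passing the same dictionary as `requests.get("https://www.flipkart.com/search", params=params)` would send the equivalent request.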

Checking the status code. A code of 200 means the request succeeded, so scraping can proceed.

In [15]:
r

<Response [200]>
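
Instead of inspecting the response by eye, the check can be done in code. The sketch below is one possible approach: `ensure_ok` is a hypothetical helper name, and the two `Response` objects are built by hand so that the example runs without a network connection.

```python
import requests

def ensure_ok(response):
    """Return the response if it succeeded; raise requests.HTTPError otherwise."""
    response.raise_for_status()  # raises for 4xx and 5xx status codes
    return response

# Offline demonstration with hand-built Response objects (no network needed).
ok = requests.Response()
ok.status_code = 200
failed = requests.Response()
failed.status_code = 404

print(ensure_ok(ok).status_code)
try:
    ensure_ok(failed)
except requests.HTTPError as err:
    error_message = str(err)
    print("request failed:", error_message)
```

In the notebook, the same helper could wrap the real response, for example `ensure_ok(r)`.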

### Parsing HTML

Parsing the HTML using Beautiful Soup and viewing the content of the page.
The prettify() method puts each tag on its own line and indents it by nesting depth, which improves readability.

In [3]:
content = BeautifulSoup(r.content, "html.parser")
print(content.prettify())

<!DOCTYPE html>
<html lang="en">
 <head>
  <link href="https://rukminim1.flixcart.com" rel="preconnect"/>
  <link href="//static-assets-web.flixcart.com/fk-p-linchpin-web/fk-cp-zion/css/app_modules.chunk.905c37.css" rel="stylesheet"/>
  <link href="//static-assets-web.flixcart.com/fk-p-linchpin-web/fk-cp-zion/css/app.chunk.c46047.css" rel="stylesheet"/>
  <meta content="text/html; charset=utf-8" http-equiv="Content-type"/>
  <meta content="IE=Edge" http-equiv="X-UA-Compatible"/>
  <meta content="102988293558" property="fb:page_id"/>
  <meta content="658873552,624500995,100000233612389" property="fb:admins"/>
  <meta content="noodp" name="robots"/>
  <link href="https:///www/promos/new/20150528-140547-favicon-retina.ico" rel="shortcut icon"/>
  <link href="/osdd.xml?v=2" rel="search" type="application/opensearchdescription+xml"/>
  <meta content="website" property="og:type"/>
  <meta content="Flipkart.com" name="og_site_name" property="og:site_name"/>
  <link href="/apple-touch-icon-57x
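
The effect of prettify() is easier to see on a small document. The sketch below uses a made-up one-line HTML string (not taken from Flipkart) and compares the compact markup with the prettified version.

```python
from bs4 import BeautifulSoup

# A tiny hypothetical document, written on one line.
html = "<html><body><div class='item'><p>Phone</p></div></body></html>"
soup = BeautifulSoup(html, "html.parser")

compact = str(soup)       # markup as parsed, still on one line
pretty = soup.prettify()  # one tag or text node per line, indented by depth
print(pretty)
```

Note that prettify() is meant for inspection only; the values are extracted from the parsed object, not from the prettified string.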

### Getting Values from HTML

Getting the names of all the mobile phones shown in the category (with tags).

In [4]:
name = content.find_all('div', attrs={'class': '_4rR01T'})
name


[<div class="_4rR01T">SAMSUNG Galaxy F23 5G (Forest Green, 128 GB)</div>,
 <div class="_4rR01T">SAMSUNG Galaxy F23 5G (Forest Green, 128 GB)</div>,
 <div class="_4rR01T">MOTOROLA e40 (Carbon Gray, 64 GB)</div>,
 <div class="_4rR01T">POCO M3 Pro 5G (Yellow, 128 GB)</div>,
 <div class="_4rR01T">SAMSUNG Galaxy F23 5G (Aqua Blue, 128 GB)</div>,
 <div class="_4rR01T">SAMSUNG Galaxy F04 (Jade Purple, 64 GB)</div>,
 <div class="_4rR01T">APPLE iPhone 14 (Starlight, 128 GB)</div>,
 <div class="_4rR01T">SAMSUNG Galaxy F23 5G (Copper Blush, 128 GB)</div>,
 <div class="_4rR01T">MOTOROLA G62 5G (Midnight Gray, 128 GB)</div>,
 <div class="_4rR01T">APPLE iPhone 13 (Blue, 128 GB)</div>,
 <div class="_4rR01T">APPLE iPhone 14 (Blue, 128 GB)</div>,
 <div class="_4rR01T">REDMI 10 (Caribbean Green, 64 GB)</div>,
 <div class="_4rR01T">REDMI Note 11 SE (Cosmic White, 64 GB)</div>,
 <div class="_4rR01T">MOTOROLA G32 (Mineral Gray, 64 GB)</div>,
 <div class="_4rR01T">SAMSUNG Galaxy F23 5G (Aqua Blue, 128 GB)</

Selecting the text values and storing them in a list (excluding tags).

In [5]:
prod = []
for i in name:
    prod.append(i.text)
prod

['SAMSUNG Galaxy F23 5G (Forest Green, 128 GB)',
 'SAMSUNG Galaxy F23 5G (Forest Green, 128 GB)',
 'MOTOROLA e40 (Carbon Gray, 64 GB)',
 'POCO M3 Pro 5G (Yellow, 128 GB)',
 'SAMSUNG Galaxy F23 5G (Aqua Blue, 128 GB)',
 'SAMSUNG Galaxy F04 (Jade Purple, 64 GB)',
 'APPLE iPhone 14 (Starlight, 128 GB)',
 'SAMSUNG Galaxy F23 5G (Copper Blush, 128 GB)',
 'MOTOROLA G62 5G (Midnight Gray, 128 GB)',
 'APPLE iPhone 13 (Blue, 128 GB)',
 'APPLE iPhone 14 (Blue, 128 GB)',
 'REDMI 10 (Caribbean Green, 64 GB)',
 'REDMI Note 11 SE (Cosmic White, 64 GB)',
 'MOTOROLA G32 (Mineral Gray, 64 GB)',
 'SAMSUNG Galaxy F23 5G (Aqua Blue, 128 GB)',
 'SAMSUNG Galaxy F23 5G (Aqua Blue, 128 GB)',
 'MOTOROLA G62 5G (Frosted Blue, 128 GB)',
 'REDMI 10 (Midnight Black, 64 GB)',
 'REDMI 10 (Pacific Blue, 64 GB)',
 'MOTOROLA e40 (Pink Clay, 64 GB)',
 'MOTOROLA G32 (Satin Silver, 64 GB)',
 'POCO C31 (Royal Blue, 64 GB)',
 'REDMI Note 12 Pro+ 5G (Obsidian Black, 256 GB)',
 'SAMSUNG Galaxy F04 (Opal Green, 64 GB)']
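
The find-then-loop pattern above can be condensed into a list comprehension. The sketch below runs on a small hypothetical snippet that mimics the name tags shown earlier; `get_text(strip=True)` is used in place of `.text` so that surrounding whitespace is removed as well.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that mimics the name tags shown above.
html = """
<div class="_4rR01T">MOTOROLA e40 (Carbon Gray, 64 GB)</div>
<div class="_4rR01T">  POCO M3 Pro 5G (Yellow, 128 GB)  </div>
"""
soup = BeautifulSoup(html, "html.parser")

# class_ is Beautiful Soup's keyword argument for filtering on the CSS class.
tags = soup.find_all("div", class_="_4rR01T")
names = [tag.get_text(strip=True) for tag in tags]
print(names)
```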

Getting the prices of all the mobile phones shown in the category (with tags).

In [6]:
price = content.find_all('div', attrs={'class': '_30jeq3 _1_WHN1'})
price

[<div class="_30jeq3 _1_WHN1">₹16,999</div>,
 <div class="_30jeq3 _1_WHN1">₹16,999</div>,
 <div class="_30jeq3 _1_WHN1">₹7,999</div>,
 <div class="_30jeq3 _1_WHN1">₹17,999</div>,
 <div class="_30jeq3 _1_WHN1">₹16,999</div>,
 <div class="_30jeq3 _1_WHN1">₹8,499</div>,
 <div class="_30jeq3 _1_WHN1">₹72,999</div>,
 <div class="_30jeq3 _1_WHN1">₹16,999</div>,
 <div class="_30jeq3 _1_WHN1">₹14,999</div>,
 <div class="_30jeq3 _1_WHN1">₹61,999</div>,
 <div class="_30jeq3 _1_WHN1">₹72,999</div>,
 <div class="_30jeq3 _1_WHN1">₹9,999</div>,
 <div class="_30jeq3 _1_WHN1">₹16,999</div>,
 <div class="_30jeq3 _1_WHN1">₹9,999</div>,
 <div class="_30jeq3 _1_WHN1">₹16,999</div>,
 <div class="_30jeq3 _1_WHN1">₹15,999</div>,
 <div class="_30jeq3 _1_WHN1">₹14,999</div>,
 <div class="_30jeq3 _1_WHN1">₹9,999</div>,
 <div class="_30jeq3 _1_WHN1">₹9,999</div>,
 <div class="_30jeq3 _1_WHN1">₹7,999</div>,
 <div class="_30jeq3 _1_WHN1">₹9,999</div>,
 <div class="_30jeq3 _1_WHN1">₹7,749</div>,
 <div class="_30jeq

Selecting the price values and storing them in a list (excluding tags).

In [7]:
pricing = []
for i in price:
    pricing.append(i.text)
pricing

['₹16,999',
 '₹16,999',
 '₹7,999',
 '₹17,999',
 '₹16,999',
 '₹8,499',
 '₹72,999',
 '₹16,999',
 '₹14,999',
 '₹61,999',
 '₹72,999',
 '₹9,999',
 '₹16,999',
 '₹9,999',
 '₹16,999',
 '₹15,999',
 '₹14,999',
 '₹9,999',
 '₹9,999',
 '₹7,999',
 '₹9,999',
 '₹7,749',
 '₹29,999',
 '₹9,499']
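
The prices are still strings, so they cannot be sorted or averaged numerically. A minimal sketch of one way to convert them, assuming every price has the form shown above (a rupee sign followed by comma-grouped digits); `parse_price` is a hypothetical helper name.

```python
def parse_price(text):
    """Convert a price string such as '₹16,999' to an integer number of rupees."""
    digits = text.replace("₹", "").replace(",", "").strip()
    return int(digits)

# A few values copied from the list above.
sample = ["₹16,999", "₹7,999", "₹72,999"]
prices = [parse_price(p) for p in sample]
print(prices)  # [16999, 7999, 72999]
```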

Getting the ratings of all the mobile phones shown in the category (with tags).

In [8]:
rate = content.find_all('div', attrs={'class': '_3LWZlK'})
rate

[<div class="_3LWZlK">4.3</div>,
 <div class="_3LWZlK">4.1</div>,
 <div class="_3LWZlK">4.2<img class="_1wB99o" src="

Selecting the rating values and storing them in a list (excluding tags).

In [9]:
rating = []
for i in rate:
    rating.append(i.text)
rating

['4.3',
 '4.1',
 '4.2',
 '4.5',
 '4.7',
 '4.3',
 '4.2',
 '4.7',
 '4.7',
 '4.3',
 '4.3',
 '4.2',
 '4.3',
 '4.2',
 '4.2',
 '4.3',
 '4.3',
 '4.1',
 '4.2',
 '4.3',
 '4.3',
 '4.5',
 '4.2',
 '5',
 '5',
 '4.3',
 '4',
 '5',
 '4.5',
 '3',
 '2',
 '4.2',
 '5',
 '2',
 '4.3',
 '5',
 '5']
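
The rating list has 37 entries, while the name and price lists have 24 each, so the three lists cannot be paired up by position. One possible cause is that the `_3LWZlK` class is also used outside the product listings, or that some products have no rating. A sketch of one way around this is to look up each field inside its own product container. The container class `card` below is hypothetical; the real class name would have to be read from the page source.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two product cards (one without a rating) and a
# stray rating element outside any card.
html = """
<div class="card">
  <div class="_4rR01T">Phone A</div>
  <div class="_30jeq3 _1_WHN1">₹9,999</div>
  <div class="_3LWZlK">4.3</div>
</div>
<div class="card">
  <div class="_4rR01T">Phone B</div>
  <div class="_30jeq3 _1_WHN1">₹7,999</div>
</div>
<div class="_3LWZlK">5</div>
"""
soup = BeautifulSoup(html, "html.parser")

def text_or_none(parent, css_class):
    """Return the stripped text of the first matching div, or None if absent."""
    tag = parent.find("div", class_=css_class)
    return tag.get_text(strip=True) if tag else None

rows = []
for card in soup.find_all("div", class_="card"):
    rows.append({
        "Name": text_or_none(card, "_4rR01T"),
        "Price": text_or_none(card, "_30jeq3"),
        "Rating": text_or_none(card, "_3LWZlK"),
    })
print(rows)
```

Because every row is built from a single card, a missing rating becomes `None` instead of shifting the later values, and elements outside the cards are ignored.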

Collecting the names, prices and ratings in a dictionary.

In [16]:
data = {"Name": prod, "Price": pricing, "Rating": rating}
print(data)


{'Name': ['SAMSUNG Galaxy F23 5G (Forest Green, 128 GB)', 'SAMSUNG Galaxy F23 5G (Forest Green, 128 GB)', 'MOTOROLA e40 (Carbon Gray, 64 GB)', 'POCO M3 Pro 5G (Yellow, 128 GB)', 'SAMSUNG Galaxy F23 5G (Aqua Blue, 128 GB)', 'SAMSUNG Galaxy F04 (Jade Purple, 64 GB)', 'APPLE iPhone 14 (Starlight, 128 GB)', 'SAMSUNG Galaxy F23 5G (Copper Blush, 128 GB)', 'MOTOROLA G62 5G (Midnight Gray, 128 GB)', 'APPLE iPhone 13 (Blue, 128 GB)', 'APPLE iPhone 14 (Blue, 128 GB)', 'REDMI 10 (Caribbean Green, 64 GB)', 'REDMI Note 11 SE (Cosmic White, 64 GB)', 'MOTOROLA G32 (Mineral Gray, 64 GB)', 'SAMSUNG Galaxy F23 5G (Aqua Blue, 128 GB)', 'SAMSUNG Galaxy F23 5G (Aqua Blue, 128 GB)', 'MOTOROLA G62 5G (Frosted Blue, 128 GB)', 'REDMI 10 (Midnight Black, 64 GB)', 'REDMI 10 (Pacific Blue, 64 GB)', 'MOTOROLA e40 (Pink Clay, 64 GB)', 'MOTOROLA G32 (Satin Silver, 64 GB)', 'POCO C31 (Royal Blue, 64 GB)', 'REDMI Note 12 Pro+ 5G (Obsidian Black, 256 GB)', 'SAMSUNG Galaxy F04 (Opal Green, 64 GB)'], 'Price': ['₹16,999'

### Saving the Data to a DataFrame

Creating a DataFrame to arrange the collected names and prices.

In [11]:
df = pd.DataFrame()

In [12]:
df["Name"] = prod
df["Price"]= pricing
df

|   | Name | Price |
|---|------|-------|
| 0 | SAMSUNG Galaxy F23 5G (Forest Green, 128 GB) | ₹16,999 |
| 1 | SAMSUNG Galaxy F23 5G (Forest Green, 128 GB) | ₹16,999 |
| 2 | MOTOROLA e40 (Carbon Gray, 64 GB) | ₹7,999 |
| 3 | POCO M3 Pro 5G (Yellow, 128 GB) | ₹17,999 |
| 4 | SAMSUNG Galaxy F23 5G (Aqua Blue, 128 GB) | ₹16,999 |
| 5 | SAMSUNG Galaxy F04 (Jade Purple, 64 GB) | ₹8,499 |
| 6 | APPLE iPhone 14 (Starlight, 128 GB) | ₹72,999 |
| 7 | SAMSUNG Galaxy F23 5G (Copper Blush, 128 GB) | ₹16,999 |
| 8 | MOTOROLA G62 5G (Midnight Gray, 128 GB) | ₹14,999 |
| 9 | APPLE iPhone 13 (Blue, 128 GB) | ₹61,999 |
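
The same DataFrame can be built in a single call by passing a dictionary of columns. The sketch below uses two sample rows copied from the output above. pandas requires all column lists to have the same length, which may be why the 37-entry rating list is left out of the DataFrame.

```python
import pandas as pd

# Two sample rows copied from the scraped output.
names = ["MOTOROLA e40 (Carbon Gray, 64 GB)", "POCO M3 Pro 5G (Yellow, 128 GB)"]
prices = ["₹7,999", "₹17,999"]

# The lists must have equal length, otherwise pandas raises a ValueError.
frame = pd.DataFrame({"Name": names, "Price": prices})
print(frame.shape)  # (2, 2)
```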


### Converting to CSV

The DataFrame is written to a CSV file named Fproduct.csv in the current working directory.

In [13]:
df.to_csv("Fproduct.csv")
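
By default, to_csv also writes the row index as an unnamed first column, which shows up as `Unnamed: 0` when the file is read back. The sketch below compares the default with `index=False`. It writes to in-memory buffers rather than to disk, and the one-row DataFrame is made up for illustration.

```python
import io
import pandas as pd

frame = pd.DataFrame({"Name": ["Phone A"], "Price": ["₹9,999"]})

# Default: the index is written as an unnamed first column.
with_index = io.StringIO()
frame.to_csv(with_index)
with_index.seek(0)
cols_with_index = list(pd.read_csv(with_index).columns)

# index=False: only the data columns are written.
without_index = io.StringIO()
frame.to_csv(without_index, index=False)
without_index.seek(0)
cols_without_index = list(pd.read_csv(without_index).columns)

print(cols_with_index)     # ['Unnamed: 0', 'Name', 'Price']
print(cols_without_index)  # ['Name', 'Price']
```

If the index carries no meaning, `df.to_csv("Fproduct.csv", index=False)` keeps the file limited to the scraped columns.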

Result: The .csv file is saved alongside this Jupyter notebook.