# Web Scrapping Flipkart

## Overview:
In this web scraping project, I set out to gather information about mobile phones listed on [flipkart's](https://www.flipkart.com/mobiles-accessories/mobiles/pr?sid=tyy%2C4io&ctx=eyJjYXJkQ29udGV4dCI6eyJhdHRyaWJ1dGVzIjp7InRpdGxlIjp7Im11bHRpVmFsdWVkQXR0cmlidXRlIjp7ImtleSI6InRpdGxlIiwiaW5mZXJlbmNlVHlwZSI6IlRJVExFIiwidmFsdWVzIjpbIlByZW1pdW0gTW9iaWxlcyDigrkyMCwwMDArIl0sInZhbHVlVHlwZSI6Ik1VTFRJX1ZBTFVFRCJ9fX19fQ%3D%3D&wid=57.productCard.PMU_V2_18&page=1) mobile section. Focusing on the first thirty pages out of a total of 396, I collected data on various attributes, including phone names, ratings, reviews, prices, offers, specifications (RAM, ROM, expandable memory, screen size, cameras, battery, and processor), and more.

#### Scope and Objectives:

- Target Website: Flipkart Mobiles Section
##### Data Points Scraped:
- Phone Names
- Ratings and Reviews
- Prices and Offers
- RAM, ROM, Expandable Memory
- Screen Size
- Rear and Front Cameras
- Battery Information
- Processor Details
- Project Steps:

##### Data Collection:

Utilized the `requests` library to fetch HTML content from Flipkart.

Employed `BeautifulSoup` for parsing and navigating the HTML structure.

Scraped information from 'div' elements with the class '_2kHMtA'.

##### Data Processing:
Extracted relevant details from individual elements, handling cases where information was missing or structured differently.


In [1]:
import pandas as pd
from bs4 import BeautifulSoup
import requests

In [75]:
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}

phone_names = []
ratings = []
reviews = []
price = []
offer = []
ram = []
rom = []
screen_size = []
expandable = []
rear_camera = []
front_camera = []
battery = []
processor = []


for i in range(1, 30):  # Adjust the range as needed
    url = f'https://www.flipkart.com/mobiles-accessories/mobiles/pr?sid=tyy%2C4io&ctx=eyJjYXJkQ29udGV4dCI6eyJhdHRyaWJ1dGVzIjp7InRpdGxlIjp7Im11bHRpVmFsdWVkQXR0cmlidXRlIjp7ImtleSI6InRpdGxlIiwiaW5mZXJlbmNlVHlwZSI6IlRJVExFIiwidmFsdWVzIjpbIlByZW1pdW0gTW9iaWxlcyDigrkyMCwwMDArIl0sInZhbHVlVHlwZSI6Ik1VTFRJX1ZBTFVFRCJ9fX19fQ%3D%3D&wid=57.productCard.PMU_V2_18&page={i}'
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.text, 'html.parser')
    div_elements = soup.find_all('div', {'class': '_2kHMtA'})

    for element in div_elements:
        # Extract phone names
        phone_name_element = element.find('div', class_='_4rR01T')
        phone_name = phone_name_element.text.split(',')[0].replace('(', "") if phone_name_element else None
        phone_names.append(phone_name)

        # Extract ratings
        rating_element = element.find('span', class_='_2_R_DZ')
        if rating_element:
            rating = rating_element.text
            ratings.append(rating.split()[0])
            reviews.append(rating.split()[3])
        else:
            ratings.append(None)
            reviews.append(None)
        
        # Extract origianl prices and offer prices
        original_price_element = element.find('div', class_ = '_3I9_wc _27UcVY')
        phone_price = original_price_element.text if original_price_element else None
        price.append(phone_price)
        offer_price_element = element.find('div', '_30jeq3 _1_WHN1')
        phone_offer_price = offer_price_element.text if offer_price_element else None
        offer.append(phone_offer_price)
        
        # Extract phone specifications
        desc_element = element.find_all('ul', class_ = '_1xgFaf')
        #print(desc_element)
        for description in desc_element:
            phone_spec = description.find_all('li', class_="rgWa7D")
           
            # ram, rom expandable 
            for e in phone_spec[0]:
                phone_ram = e.text.split('|') if e else []
                if phone_ram:
                    ram.append(phone_ram[0])
                if len(phone_ram) > 1:
                     rom.append(phone_ram[1].strip())  # Strip any leading/trailing whitespace
                else:
                    rom.append(None)
                    
                if len(phone_ram) > 2:
                    expandable_memory = 'yes' if 'Expandable' in phone_ram[2] else 'No'
                    expandable.append(expandable_memory)
                else:
                    expandable.append(None)
                    
               
            # Screen size
            for e in phone_spec[1]:
                screen_size.append(e.split('Full')[0])
                
            # Camera specification
            for e in phone_spec[2]:
                camera = e.split('|')
                # print(camera)
                
                if camera:
                    rear_camera.append(camera[0])
                else:
                    rear_camera.append(None)
                
                if len(camera) > 1:
                    front_camera.append(camera[1])
                else:
                    front_camera.append(None)  
            
            # Battery details
            if len(phone_spec) > 3:        
                for e in phone_spec[3]:
                    if 'Battery' in e:
                        battery.append(e)
                    else:
                        battery.append(None)
            else:
                battery.append(None)
               
            # Processor details
            if len(phone_spec) > 4:   
                for e in phone_spec[4]:
                    if 'Processor' in e:
                        processor.append(e)
                    else:
                        processor.append(None)
            else:
                processor.append(None)
                    

In [76]:
# converting the collected datas into a dataframe.

flipkart_mobiles = pd.DataFrame( {'phone_names': phone_names,
                                  'ratings': ratings,
                                  'reviews': reviews,
                                  'price': price,
                                   'offer': offer,
                                   'ram': ram,
                                   'rom': rom,
                                   'screen_size':screen_size,
                                   'expandable':expandable,
                                   'rear_camera': rear_camera,
                                   'front_camera':front_camera,
                                   'battery': battery,
                                   'processor':processor})

In [77]:
flipkart_mobiles

Unnamed: 0,phone_names,ratings,reviews,price,offer,ram,rom,screen_size,expandable,rear_camera,front_camera,battery,processor
0,SAMSUNG Galaxy F14 5G GOAT Green,69173,4895,"₹18,490","₹14,990",6 GB RAM,128 GB ROM,16.76 cm (6.6 inch),yes,50MP + 2MP,13MP Front Camera,6000 mAh Battery,"Exynos 1330, Octa Core Processor"
1,vivo T2x 5G Glimmer Black,50584,2984,"₹20,999","₹14,999",8 GB RAM,128 GB ROM,16.71 cm (6.58 inch),,50MP + 2MP,8MP Front Camera,5000 mAh Battery,Dimensity 6020 Processor
2,vivo T2x 5G Aurora Gold,50584,2984,"₹20,999","₹14,999",8 GB RAM,128 GB ROM,16.71 cm (6.58 inch),,50MP + 2MP,8MP Front Camera,5000 mAh Battery,Dimensity 6020 Processor
3,vivo T2x 5G Glimmer Black,250618,13734,"₹18,999","₹12,999",6 GB RAM,128 GB ROM,16.71 cm (6.58 inch),,50MP + 2MP,8MP Front Camera,5000 mAh Battery,Dimensity 6020 Processor
4,vivo T2x 5G Marine Blue,130651,6940,"₹17,999","₹11,999",4 GB RAM,128 GB ROM,16.71 cm (6.58 inch),,50MP + 2MP,8MP Front Camera,5000 mAh Battery,Dimensity 6020 Processor
...,...,...,...,...,...,...,...,...,...,...,...,...,...
691,LAVA A1,20705,1923,,"₹1,049",4 MB RAM,24 MB ROM,4.5 cm (1.77 inch) Display,yes,0.3MP Rear Camera,,800 mAh Battery,
692,Tecno Spark Go 2023 UYUNI BLUE,934,59,"₹8,299","₹6,495",3 GB RAM,32+3 GB ROM,16.66 cm (6.56 inch) Display,,13MP Rear Camera,,5000 mAh Battery,
693,IQOO Z7 Pro 5G Blue Lagoon,173,12,"₹31,999","₹27,900",8 GB RAM,256 GB ROM,17.22 cm (6.78 inch) Display,,64MP Rear Camera,,4600 mAh Battery,
694,Nokia 106 4G Keypad Mobile,949,80,"₹2,999","₹2,339",32 MB RAM,32 MB ROM,4.5 cm (1.77 inch) Display,,0MP,0MP Front Camera,1450 mAh Battery,Unisoc T107 Processor


In [79]:
# save the dataframe as csv file
flipkart_mobiles.to_csv(r'C:\Users\LENOVO\Desktop\webscrapping\flipkart_mobiles.csv', index = False)