# Mobile Phone Analysis

Analysis of phone characteristics and the popularity of different phone models, based on data collected on the 14th of March.

In [290]:
import pandas as pd
import plotly.express as px

In [291]:
df = pd.read_csv('../clean_phones/relevant_features.csv')
df = df[df['type'] == 'phone']
df.head()

Unnamed: 0,brand,model,photo_link,phone_link,popularity_become_fan,popularity_views,popularity_views_today,price,eSIM,announce_year,...,radio,usb_type,usb_version,biometric_auth,has_black_color,foldable,battery_type,battery_capacity,type,5g
0,alcatel,1B (2022),https://fdn2.gsmarena.com/vv/bigpic/alcatel-1b...,https://www.gsmarena.com/alcatel_1b_(2022)-117...,18,252662,0.3,100.0,False,2022.0,...,True,microUSB,2.0,False,True,False,Li-Ion,3000.0,phone,False
1,alcatel,1L Pro (2021),https://fdn2.gsmarena.com/vv/bigpic/alcatel-1l...,https://www.gsmarena.com/alcatel_1l_pro_(2021)...,14,207311,0.1,110.0,False,2021.0,...,True,microUSB,2.0,True,False,False,Li-Ion,3000.0,phone,False
2,alcatel,1 (2021),https://fdn2.gsmarena.com/vv/bigpic/alcatel-1-...,https://www.gsmarena.com/alcatel_1_(2021)-1098...,19,238822,0.1,60.0,False,2021.0,...,True,microUSB,2.0,False,True,False,Li-Ion,2000.0,phone,False
3,alcatel,3L (2021),https://fdn2.gsmarena.com/vv/bigpic/alcatel-a3...,https://www.gsmarena.com/alcatel_3l_(2021)-106...,20,218264,0.0,330.0,False,2021.0,...,True,microUSB,2.0,True,True,False,Li-Po,4000.0,phone,False
4,alcatel,1S (2021),https://fdn2.gsmarena.com/vv/bigpic/alcatel-1s...,https://www.gsmarena.com/alcatel_1s_(2021)-106...,18,259768,0.1,130.0,False,2021.0,...,True,microUSB,2.0,True,True,False,Li-Po,4000.0,phone,False


In [292]:
# Keep the 20 brands with the highest mean view count.
top_brands = df.groupby('brand')['popularity_views'].mean().nlargest(20).index
df = df[df['brand'].isin(top_brands)]
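
The brand filter above can be checked on a small example. The sketch below uses made-up toy data and a hypothetical helper name, `top_brands_by_mean_views`; it is not part of the original notebook.

```python
import pandas as pd

# Toy data standing in for the real dataset (all values are made up).
toy = pd.DataFrame({
    "brand": ["a", "a", "b", "b", "c", "c"],
    "popularity_views": [100, 300, 50, 70, 900, 1100],
})


def top_brands_by_mean_views(frame: pd.DataFrame, n: int) -> list:
    """Return the n brands with the highest mean popularity_views."""
    means = frame.groupby("brand")["popularity_views"].mean()
    return means.nlargest(n).index.tolist()


top2 = top_brands_by_mean_views(toy, n=2)
filtered = toy[toy["brand"].isin(top2)]
print(top2)
print(len(filtered))
```

Because the ranking uses the mean, a brand with only a few heavily viewed phones can outrank a brand with many moderately viewed ones.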

In [293]:
pdf = df.groupby('brand').size().reset_index(name="counts")
px.bar(pdf.sort_values('counts'), x='brand', y='counts')

The plot shows that Xiaomi released the most smartphones in the years 2020-2025, followed by Oppo and Realme, two other Chinese manufacturers.

This might mean that they try to target all market segments with their phone models. This will be investigated in subsequent graphs.

In [294]:
pdf = df[['price']]
px.histogram(pdf, x='price')

The most frequent price segment for phones in recent years is the budget-friendly one, from 75 up to 225 euros. This might imply that there is a lot of competition in the budget phone segment, with many manufacturers offering many different phones in this price range.

The long tail with low counts implies that there are comparatively few premium flagship phones.
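
The share of phones in each price segment can also be counted directly instead of being read off the histogram. This is a minimal sketch with made-up prices; the segment edges 75 and 225 come from the text above, while the segment labels are assumptions chosen for illustration.

```python
import pandas as pd

# Made-up prices in euros, standing in for df['price'].
prices = pd.Series([60, 100, 110, 130, 199, 330, 450, 899, 1299], name="price")

# Left-closed bins: [0, 75), [75, 225), [225, inf).
edges = [0, 75, 225, float("inf")]
labels = ["below 75", "75 to 225", "225 and above"]

segments = pd.cut(prices, bins=edges, labels=labels, right=False)
counts = segments.value_counts().reindex(labels)
print(counts)
```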

In [295]:
pdf = df[['popularity_become_fan']]
px.histogram(pdf, x='popularity_become_fan', nbins=100)

Most phones receive between 20 and 40 likes.

In [296]:
pdf = df[['price']]
px.box(pdf, x='price')

The box plot shows that most phones are priced at up to 730 euros, and that the remaining phones with higher prices might be considered flagships.
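
A box plot conventionally draws its upper whisker at the largest value below Q3 + 1.5 × IQR. Assuming the 730 euro figure above corresponds to that whisker, a similar threshold can be computed without the plot. The sketch below uses made-up prices and a hypothetical helper, `tukey_upper_fence`; Plotly's quartile algorithm may differ slightly from pandas' default interpolation, so the two values need not match exactly.

```python
import pandas as pd


def tukey_upper_fence(values: pd.Series) -> float:
    """Return Q3 + 1.5 * IQR, the usual outlier threshold of a box plot."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    return float(q3 + 1.5 * (q3 - q1))


# Made-up prices in euros, standing in for df['price'].
prices = pd.Series([100, 200, 300, 400, 500, 2000])

fence = tukey_upper_fence(prices)
outliers = prices[prices > fence]
print(fence)
print(outliers.tolist())
```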

In [297]:
pdf = df[['price', 'brand', 'model', 'release_year']]
px.scatter(pdf, x='release_year', y='price', hover_data=['brand', 'model'])

In [298]:
pdf = df
px.density_heatmap(pdf, x='release_year', y='brand', z='popularity_become_fan', histfunc='avg')

Some brands, like Nothing, received a massive popularity boost at the start, and their popularity gradually decreased over time.
Other brands, like Samsung, have kept the number of likes on their phones consistent throughout the years.

In [299]:
pdf = df
px.density_heatmap(pdf, x='release_year', y='brand', z='popularity_views', histfunc='avg')

Apple receives the most views on average.

In [300]:
pdf = df.groupby(['release_year', 'nfc']).size().reset_index(name="counts")
px.bar(pdf, x='release_year', y='counts', color='nfc')

In [301]:
pdf = df.groupby(['release_year', '5g']).size().reset_index(name="counts")
px.bar(pdf, x='release_year', y='counts', color='5g')

In [302]:
pdf = df.groupby(['release_year', 'usb_type']).size().reset_index(name="counts")
px.bar(pdf, x='release_year', y='counts', color='usb_type')

Technological advancement and the reduced cost of producing USB Type-C have pushed manufacturers to abandon earlier connector types. EU regulations are another part of the reason for adopting USB Type-C.
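
Absolute counts per year depend on how many phones were released that year, so the shift towards USB Type-C is easier to judge as a share within each year. The sketch below uses made-up rows; the `usb_type` labels are assumptions about how the values are spelled in the dataset.

```python
import pandas as pd

# Made-up rows standing in for df[['release_year', 'usb_type']].
toy = pd.DataFrame({
    "release_year": [2020, 2020, 2020, 2020, 2024, 2024],
    "usb_type": [
        "microUSB", "microUSB", "microUSB", "USB Type-C",
        "USB Type-C", "USB Type-C",
    ],
})

# normalize="index" makes every row (one year) sum to 1.
shares = pd.crosstab(toy["release_year"], toy["usb_type"], normalize="index")
print(shares)
```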

In [303]:
pdf = df.groupby(['brand', 'internal_ram_gb']).size().reset_index(name="counts")
pdf['brand_total'] = pdf.groupby('brand')['counts'].transform('sum')
# Every brand_total is a sum of non-empty group sizes, so it is always positive.
pdf['proportion_within_brand'] = (pdf['counts'] / pdf['brand_total']).round(3)

px.bar(pdf, x='brand', y='internal_ram_gb', color='proportion_within_brand', hover_data=['counts', 'proportion_within_brand'], # Show counts and proportion on hover
    labels={'proportion_within_brand': 'Proportion within Brand'}, # Update the color bar label
    title='Internal RAM Distribution (Color Relative to Brand Total)',
    barmode='group')

In [304]:
pdf = df.groupby(['brand', 'internal_ram_gb']).size().reset_index(name="counts")
px.density_heatmap(pdf, x='internal_ram_gb', y='brand', z='counts', histfunc='avg')

In [305]:
pdf = df.groupby(['screen_resolution_x', 'screen_resolution_y']).size().reset_index(name="counts")
px.scatter(pdf, x='screen_resolution_x', y='screen_resolution_y', size='counts', size_max=100)

Most of the phones have a resolution of 1080x2400, which is a 9:20 aspect ratio, followed by 720x1600, which is also 9:20. Several phones have tablet-like resolutions and aspect ratios; these are foldable phones, as can be seen below:

In [306]:
pdf = df.groupby(['screen_resolution_x', 'screen_resolution_y', 'foldable']).size().reset_index(name="counts")
px.scatter(pdf, x='screen_resolution_x', y='screen_resolution_y', size='counts', color='foldable', size_max=100)
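
The 9:20 aspect ratios quoted above follow from dividing both resolution values by their greatest common divisor. A minimal sketch, with `aspect_ratio` as a hypothetical helper name:

```python
from math import gcd


def aspect_ratio(width_px: int, height_px: int) -> tuple:
    """Reduce a resolution to its smallest whole-number ratio."""
    divisor = gcd(width_px, height_px)
    return width_px // divisor, height_px // divisor


print(aspect_ratio(1080, 2400))  # (9, 20)
print(aspect_ratio(720, 1600))   # (9, 20)
```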

In [307]:
pdf = df.groupby(['ip_rating'])['popularity_become_fan'].mean().reset_index()
px.bar(pdf.sort_values('popularity_become_fan'), x='ip_rating', y='popularity_become_fan')

Phones with IP67 or IP68 water and dust protection receive more likes on average than phones without any water and dust protection (IPXX).

Ultra-rugged phones with the IP69 protection level do not receive many likes.

In [308]:
pdf = df
px.histogram(pdf, x='screen_size')

In [309]:
# pdf = df.groupby(['height_mm', 'length_mm', 'screen_size'])['popularity_become_fan'].mean().reset_index()
pdf = df
px.density_heatmap(pdf, x='screen_size',
                 y='brand', # Or perhaps 'width_mm' if that's the intended dimension
                 z='popularity_become_fan',
                 histfunc='avg')

The most preferred screen size lies between 6.5 and 7 inches.

Apple's small phones, with screen sizes of 4.5-5.5 inches, receive the same popularity as its larger models.

Google's smaller phones, with screen sizes below 6 inches, also compete with the models that have larger screens.

In [310]:
pdf = df.groupby(['screen_hz'])['popularity_become_fan'].mean().reset_index()
px.bar(pdf.sort_values('popularity_become_fan'), x='screen_hz', y='popularity_become_fan')

Phones with a higher screen refresh rate are more popular than phones with the basic refresh rate of 60 Hz.

In [311]:
pdf = df.groupby(['screen_type', 'release_year'])['popularity_become_fan'].mean().reset_index()
px.scatter(pdf.sort_values('popularity_become_fan'), x='release_year', y='screen_type', size='popularity_become_fan', size_max=30)

LTPO OLED is the most popular screen type in smartphones, followed by OLED and AMOLED, while TN+Film screens are the least popular.

IPS screens have plummeted in popularity over time. 

In [312]:
pdf = df
px.histogram(pdf, x='battery_capacity', nbins=10)

In [313]:
pdf = df.groupby('battery_capacity')['popularity_become_fan'].mean().reset_index()
px.histogram(pdf, x='battery_capacity', y='popularity_become_fan', nbins=40, histfunc='avg')

In [314]:
pdf = df.groupby('internal_rom_gb')['popularity_become_fan'].mean().reset_index()
pdf['internal_rom_gb'] = pdf['internal_rom_gb'].astype('str')
px.bar(pdf, x='internal_rom_gb', y='popularity_become_fan')

Phones with larger internal storage receive more likes than phones with less internal storage.
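
A per-group mean can rest on very few phones, so it helps to show the group size next to the average before drawing conclusions such as the one above. The sketch below uses made-up values; the output column names `mean_likes` and `phones` are assumptions chosen for illustration.

```python
import pandas as pd

# Made-up rows standing in for the real dataset.
toy = pd.DataFrame({
    "internal_rom_gb": [64, 64, 64, 128, 128, 1024],
    "popularity_become_fan": [10, 20, 30, 40, 60, 500],
})

# Named aggregation: one column for the mean, one for the group size.
summary = (
    toy.groupby("internal_rom_gb")["popularity_become_fan"]
    .agg(mean_likes="mean", phones="count")
    .reset_index()
)
print(summary)
```

In this toy data the 1024 GB group has the highest mean but contains a single phone, which is the kind of case the extra column makes visible.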

In [315]:
pdf = df[['camera_mp']]
px.histogram(pdf, x='camera_mp')

In [316]:
pdf = df.groupby('camera_mp')['popularity_become_fan'].mean().reset_index()
px.histogram(pdf.sort_values('camera_mp'), x='camera_mp', y='popularity_become_fan', histfunc='avg', nbins=10)

In [317]:
pdf = df.groupby('camera_f')['popularity_become_fan'].mean().reset_index()
px.histogram(pdf.sort_values('camera_f'), x='camera_f', y='popularity_become_fan', histfunc='avg', nbins=10)

In [318]:
pdf = df.groupby('camera_video_resolution')['popularity_become_fan'].mean().reset_index()
px.histogram(pdf, x='camera_video_resolution', y='popularity_become_fan', histfunc='avg', nbins=10)

In [319]:
pdf = df
px.density_heatmap(pdf, x='camera_video_resolution', y='camera_video_fps', z='popularity_become_fan', histfunc='avg')

Phones with a maximum video resolution of 1080p do not receive as much popularity as phones with 4K or higher video resolution. Among the latter, the most popular frame rate is 120 fps, with an average of 642 likes per phone.
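
The per-cell averages behind a heatmap like the one above can be listed as exact numbers with a pivot table. This is a sketch on made-up rows; the resolution labels `1080p` and `4k` are assumptions about how the column is encoded in the dataset.

```python
import pandas as pd

# Made-up rows standing in for the real dataset.
toy = pd.DataFrame({
    "camera_video_resolution": ["1080p", "1080p", "4k", "4k", "4k"],
    "camera_video_fps": [30, 30, 60, 120, 120],
    "popularity_become_fan": [20, 40, 100, 600, 700],
})

# One row per frame rate, one column per resolution, mean likes in each cell.
cell_means = toy.pivot_table(
    index="camera_video_fps",
    columns="camera_video_resolution",
    values="popularity_become_fan",
    aggfunc="mean",
)
print(cell_means)
```

Combinations that do not occur in the data show up as NaN rather than as zero.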

In [320]:
pdf = df.groupby('screen_to_body')['popularity_become_fan'].mean().reset_index()
px.histogram(pdf, x='screen_to_body', y='popularity_become_fan', histfunc='avg', nbins=100)

In [321]:
pdf = df.groupby('cancelled')['popularity_become_fan'].mean().reset_index()
px.bar(pdf, x='cancelled', y='popularity_become_fan')