# Vestiaire Collective

![vestiaire](../Images/vestiaire_collective.jpg)

## User Behaviour Analysis

#### Introduction
In this notebook, a user behavior analysis is performed on the user data of a fashion e-commerce company. The data describes the inactivity of registered users.

The notebook begins with background on the company, the problem description, and information about the data used. Next, the data is cleaned and prepared for analysis using Python and Pandas. Then, using a local MySQL installation and a MySQL Python Connector, a database is created and the relevant data is uploaded for storage. The data is then extracted with SQL queries and saved to a CSV file for analysis. Finally, the data is analyzed and insights are extracted using Tableau.

#### Company's Description

Vestiaire Collective is an iconic French tech unicorn with a business model focused on fashion and environmental sustainability. The company was founded in Paris in 2009 by Fanny Moizant and Sophie Hersan. Since then, Vestiaire Collective has become the leading online C2C marketplace for pre-owned designer clothing and accessories.

As of 2022, the company has reached 23 million members, a catalogue of 5 million items, and a Gross Merchandise Value (GMV) exceeding 1 billion dollars. Its reach is global and rapidly expanding.

The company's business model is a circular economy in which users buy and sell pre-owned designer clothes. This allows buyers to acquire a high-quality product at a lower price and sellers to earn more from used clothing. Building trust is essential for this type of business. To achieve it, every product sold is sent to the company's warehouse, where a professional team verifies that it meets quality and current fashion standards. Vestiaire Collective targets millennial and Gen Z customers, market segments with a growing concern for sustainability.

#### Problem Description

The objective of this analysis is to provide actionable recommendations to reduce user inactivity and improve user retention. To do so, it is important to understand and categorize users that have been inactive on the platform for 11 to 365 days.

The exploratory data analysis aims to answer the following question:

How can user inactivity be reduced?

#### Dataset Description
The data used for the analysis is stored in a CSV file called "users-dataset.csv" and covers the first months of 2019.

The dataset contains structured data with 98,913 observations and twenty-four variables:
- **identifierHash** - Hash of the user's id.
- **type** - The type of entity.
- **country** - User's country (written in French).
- **language** - The user's preferred language.
- **socialNbFollowers** - Number of users who subscribed to this user's activity. New accounts are automatically followed by the store's official accounts.
- **socialNbFollows** - Number of user accounts this user follows. New accounts are automatically assigned to follow the official partners.
- **socialProductsLiked** - Number of products this user liked.
- **productsListed** - Number of currently unsold products that this user has uploaded.
- **productsSold** - Number of products this user has sold.
- **productsPassRate** - % of products meeting the product description.
- **productsWished** - Number of products this user added to his/her wish list.
- **productsBought** - Number of products this user bought.
- **gender** - User's gender.
- **civilityGenderId** - Civility as integer.
- **civilityTitle** - Civility title.
- **hasAnyApp** - User has ever used any of the store's official apps.
- **hasAndroidApp** - User has ever used the official Android app.
- **hasIosApp** - User has ever used the official iOS app.
- **hasProfilePicture** - User has a custom profile picture.
- **daysSinceLastLogin** - Number of days since the last login.
- **seniority** - Number of days since the user registered.
- **seniorityAsMonths** - Seniority in months.
- **seniorityAsYears** - Seniority in years.
- **countryCode** - User's country (ISO-3166-1).

#### Information about the data
The data was mined from the company's website and made publicly available by [Jeffrey Mvutu Mabilama](https://www.kaggle.com/datasets/jmmvutu/ecommerce-users-of-a-french-c2c-fashion-store/code?sort=votes&select=6M-0K-99K.users.dataset.public.csv) on Kaggle under the following [license](https://creativecommons.org/licenses/by-nc-sa/4.0/).

The data was previously manipulated and ranked. The personal account identifiers were hashed by the owner of the dataset to protect the privacy of the users.

#### Step 1 - Data Cleaning
The data will be analyzed for insights in Tableau at the end of this notebook. To improve its reliability before it is stored, the data is first cleaned using Python and Pandas.

The required libraries and the dataset are imported into the notebook.

In [1]:
import pandas as pd
import numpy as np
import pycountry
import csv
from mysql.connector import connect, Error
from getpass import getpass
from pathlib import Path

In [2]:
users_data = pd.read_csv("../Data/users-dataset.csv")
users_data.head(5)

Unnamed: 0,identifierHash,type,country,language,socialNbFollowers,socialNbFollows,socialProductsLiked,productsListed,productsSold,productsPassRate,...,civilityTitle,hasAnyApp,hasAndroidApp,hasIosApp,hasProfilePicture,daysSinceLastLogin,seniority,seniorityAsMonths,seniorityAsYears,countryCode
0,-1097895247965112460,user,Royaume-Uni,en,147,10,77,26,174,74.0,...,mr,True,False,True,True,11,3196,106.53,8.88,gb
1,2347567364561867620,user,Monaco,en,167,8,2,19,170,99.0,...,mrs,True,False,True,True,12,3204,106.8,8.9,mc
2,6870940546848049750,user,France,fr,137,13,60,33,163,94.0,...,mrs,True,False,True,False,11,3203,106.77,8.9,fr
3,-4640272621319568052,user,Etats-Unis,en,131,10,14,122,152,92.0,...,mrs,True,False,True,False,12,3198,106.6,8.88,us
4,-5175830994878542658,user,Etats-Unis,en,167,8,0,25,125,100.0,...,mrs,False,False,False,True,22,2854,95.13,7.93,us


The dataset is checked for duplicated entries.

In [3]:
users_data["identifierHash"].duplicated().value_counts()

False    98913
Name: identifierHash, dtype: int64
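
As a rough illustration of what the duplicate check computes, the sketch below finds repeated identifiers in plain Python. The sample ids and the helper name `find_duplicates` are hypothetical and are not part of the notebook or the dataset.

```python
from collections import Counter


def find_duplicates(identifiers):
    """Return the identifiers that appear more than once, in first-seen order."""
    counts = Counter(identifiers)
    return [value for value, count in counts.items() if count > 1]


# Hypothetical sample of hashed user ids (not taken from the dataset)
sample_ids = [101, 202, 303, 202, 404, 101]
duplicates = find_duplicates(sample_ids)
print(duplicates)
```

An empty result corresponds to the situation in the cell above, where every value of `duplicated()` is `False`.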

The dataset has no duplicated entries. Next, the variables not needed for the analysis are dropped and the remaining columns are renamed.

In [4]:
columns_to_drop = [
    "country", "type", "civilityGenderId", "civilityTitle", 
    "seniorityAsMonths", "seniorityAsYears"
]
columns_to_rename = {
    "identifierHash": "customer_id",
    "index": "account_id",
    "socialNbFollowers": "followers",
    "socialNbFollows": "follows",
    "socialProductsLiked": "products_liked",
    "productsListed": "products_listed",
    "productsSold": "products_sold",
    "productsPassRate": "products_passrate",
    "productsWished": "products_wished",
    "productsBought": "products_bought",
    "hasAnyApp": "uses_any_app",
    "hasAndroidApp": "uses_android_app",
    "hasIosApp": "uses_ios_app",
    "hasProfilePicture": "profile_picture",
    "daysSinceLastLogin": "days_since_last_login",
    "seniority": "days_since_registration",
    "countryCode": "country_code"
}

users_data_v2 = users_data.reset_index().drop(columns_to_drop, axis=1).rename(columns = columns_to_rename)
users_data_v2.head(5)

Unnamed: 0,account_id,customer_id,language,followers,follows,products_liked,products_listed,products_sold,products_passrate,products_wished,products_bought,gender,uses_any_app,uses_android_app,uses_ios_app,profile_picture,days_since_last_login,days_since_registration,country_code
0,0,-1097895247965112460,en,147,10,77,26,174,74.0,104,1,M,True,False,True,True,11,3196,gb
1,1,2347567364561867620,en,167,8,2,19,170,99.0,0,0,F,True,False,True,True,12,3204,mc
2,2,6870940546848049750,fr,137,13,60,33,163,94.0,10,3,F,True,False,True,False,11,3203,fr
3,3,-4640272621319568052,en,131,10,14,122,152,92.0,7,0,F,True,False,True,False,12,3198,us
4,4,-5175830994878542658,en,167,8,0,25,125,100.0,0,0,F,False,False,False,True,22,2854,us


The dataset needs further inspection. The data types of the variables are explored:

In [5]:
users_data_v2.dtypes

account_id                   int64
customer_id                  int64
language                    object
followers                    int64
follows                      int64
products_liked               int64
products_listed              int64
products_sold                int64
products_passrate          float64
products_wished              int64
products_bought              int64
gender                      object
uses_any_app                  bool
uses_android_app              bool
uses_ios_app                  bool
profile_picture               bool
days_since_last_login        int64
days_since_registration      int64
country_code                object
dtype: object

The variables have the correct data types. Next, every variable is checked for erroneous or duplicated entries.

In [6]:
users_data_v2.nunique()

account_id                 98913
customer_id                98913
language                       5
followers                     90
follows                       85
products_liked               420
products_listed               65
products_sold                 75
products_passrate             72
products_wished              279
products_bought               70
gender                         2
uses_any_app                   2
uses_android_app               2
uses_ios_app                   2
profile_picture                2
days_since_last_login        699
days_since_registration       19
country_code                 199
dtype: int64

In [7]:
columns_to_inspect = [
    "language", "gender", "uses_any_app", "uses_android_app", "uses_ios_app", 
    "profile_picture"
]
users_data_v2.apply(lambda x: x.unique())

account_id                 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,...
customer_id                [-1097895247965112460, 2347567364561867620, 68...
language                                                [en, fr, de, it, es]
followers                  [147, 167, 137, 131, 130, 121, 53, 744, 57, 12...
follows                    [10, 8, 13, 12, 0, 9, 13764, 40, 19, 16, 60, 3...
products_liked             [77, 2, 60, 14, 0, 1, 1140, 3, 51671, 45, 863,...
products_listed            [26, 19, 33, 122, 25, 47, 31, 5, 0, 123, 40, 6...
products_sold              [174, 170, 163, 152, 125, 123, 108, 106, 104, ...
products_passrate          [74.0, 99.0, 94.0, 92.0, 100.0, 91.0, 98.0, 85...
products_wished            [104, 0, 10, 7, 531, 1842, 6, 68, 564, 2, 1016...
products_bought            [1, 0, 3, 105, 2, 36, 32, 14, 115, 6, 8, 69, 1...
gender                                                                [M, F]
uses_any_app                                                   [True, False]

The data has no erroneous entries. However, an analysis of the country code variable shows two codes, "ic" and "an", that do not comply with ISO 3166-1. The rows with these codes are removed from the dataset.

In [8]:
for code in users_data_v2["country_code"]:
    try:
        country_name = pycountry.countries.get(alpha_2 = code).name
    except AttributeError:
        print(code)

ic
an
ic
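
The loop above prints a code once per offending row, which is why "ic" appears twice. A minimal sketch of an alternative that reports each invalid code only once is shown below. It assumes a small hard-coded set of valid codes as a stand-in for the pycountry lookup, and the helper name `invalid_codes` is hypothetical.

```python
def invalid_codes(codes, valid_codes):
    """Return the distinct codes that are not in valid_codes, sorted."""
    return sorted({code for code in codes if code.lower() not in valid_codes})


# Hypothetical subset of ISO 3166-1 alpha-2 codes, lowercased as in the dataset
VALID = {"gb", "mc", "fr", "us", "de", "it"}
observed = ["gb", "ic", "fr", "an", "us", "ic"]
print(invalid_codes(observed, VALID))
```

The resulting list could then be passed to `isin` when selecting the rows to drop, instead of typing the codes by hand.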


In [9]:
values_to_drop = users_data_v2[users_data_v2["country_code"].isin(["ic", "an"])]
users_data_v3 = users_data_v2.drop(values_to_drop.index)

A new variable that will contain the full name of the countries is created. The pycountry library is used for this purpose.

In [10]:
country = []
for code in users_data_v3["country_code"]:
    country_name = pycountry.countries.get(alpha_2 = code).name
    country.append(country_name)

users_data_v3["country"] = country
users_data_v3.head()

Unnamed: 0,account_id,customer_id,language,followers,follows,products_liked,products_listed,products_sold,products_passrate,products_wished,products_bought,gender,uses_any_app,uses_android_app,uses_ios_app,profile_picture,days_since_last_login,days_since_registration,country_code,country
0,0,-1097895247965112460,en,147,10,77,26,174,74.0,104,1,M,True,False,True,True,11,3196,gb,United Kingdom
1,1,2347567364561867620,en,167,8,2,19,170,99.0,0,0,F,True,False,True,True,12,3204,mc,Monaco
2,2,6870940546848049750,fr,137,13,60,33,163,94.0,10,3,F,True,False,True,False,11,3203,fr,France
3,3,-4640272621319568052,en,131,10,14,122,152,92.0,7,0,F,True,False,True,False,12,3198,us,United States
4,4,-5175830994878542658,en,167,8,0,25,125,100.0,0,0,F,False,False,False,True,22,2854,us,United States


#### Step 2 - Database Storage
The dataset has been cleaned and is now ready to be saved to the database.

Steps to follow:

1. MySQL Server is downloaded and installed on the local machine. Storing the data requires a database schema, so a schema that follows data normalization principles was designed.

2. The cleaned dataset is split into tables according to the database schema.

3. Using a MySQL Python Connector, the necessary database and tables are created with SQL queries.

4. The data is uploaded to the database using the connector.

5. Finally, the database is queried, and the information needed to answer the problem description is extracted to a CSV file.

##### 1. Database Schema
The following database schema was designed to organize the data into normalized tables:

![database_schema](../DB/Schema/schema.png)

##### 2. Table Creation
Following the database schema, the cleaned dataset is split into tables, and each table is saved as a CSV file.

In [11]:
# One row per distinct (country_code, country) pair
country = users_data_v3.loc[:, ["country_code", "country"]].drop_duplicates()
user_info = users_data_v3.loc[:, ["customer_id", "account_id", "language", "gender", "country_code"]]
user_account = users_data_v3.loc[:, ["account_id", "followers", "follows", "profile_picture"]].replace({True: 1, False: 0})
login_info = users_data_v3.loc[:, ["account_id", "days_since_last_login", "days_since_registration"]]
products = users_data_v3.loc[:, ["account_id", "products_liked", "products_listed", "products_sold", 
                                "products_passrate", "products_wished", "products_bought"]]
app_type = pd.DataFrame(["android", "ios"]).reset_index().rename(columns={"index": "app_id", 0: "app_type"})

to_melt = users_data_v3.loc[:, ["account_id", "uses_android_app", "uses_ios_app"]]
app = to_melt.melt(id_vars=["account_id"], value_vars=["uses_android_app", "uses_ios_app"], var_name="app_id", value_name="uses_app")
app = app.replace({"uses_android_app": 0, "uses_ios_app": 1, True: 1, False: 0})

In [12]:
# Path builds the directory prefix with the operating system's separator;
# the trailing space keeps the final separator, and strip() then removes the space
p = Path("../DB/Db_Tables/ ")
p = str(p).strip()
tables_to_save = [country, user_account, user_info, login_info, products, app_type, app]
naming_tables = ["country", "user_account", "user_info", "login_info", "products", "app_type", "app"]

# Pair each table with its file name instead of tracking a manual counter
for name, table in zip(naming_tables, tables_to_save):
    table.to_csv(f"{p}{name}.csv", index=False, header=False)
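
As a side note on path handling, the sketch below shows one way to build file paths with the `/` operator of `pathlib`, which avoids joining strings by hand. It is only an illustration: the miniature tables, the temporary directory, and the helper name `save_tables` are assumptions, and the standard `csv` module stands in for `DataFrame.to_csv`.

```python
import csv
import tempfile
from pathlib import Path


def save_tables(tables, out_dir):
    """Write each (name -> rows) table to out_dir/<name>.csv and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, rows in tables.items():
        target = out_dir / f"{name}.csv"  # "/" joins path parts portably
        with open(target, "w", newline="") as handle:
            csv.writer(handle).writerows(rows)
        written.append(target)
    return written


# Hypothetical miniature tables, written to a temporary directory
tables = {"country": [("gb", "United Kingdom")], "app_type": [(0, "android"), (1, "ios")]}
with tempfile.TemporaryDirectory() as tmp:
    paths = save_tables(tables, Path(tmp) / "Db_Tables")
    names = [path.name for path in paths]
    first_line = paths[0].read_text().strip()
print(names, first_line)
```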

##### 3. Database and Table Creation
Using the MySQL Python Connector, a database and the related tables are created. To reproduce this code, MySQL Server must be installed on your local machine, and a valid username and password must be entered in each cell.

In [22]:
# Query that will be executed on the MySQL server
create_db_query = """
CREATE DATABASE vestiare_collective
"""

try:
    # Establishing a connection with the server
    with connect(
        host="localhost",
        user=input("Enter username: "),
        password=getpass("Enter password: "),
    ) as connection:
        # Actions performed in the server
        with connection.cursor() as cursor:
            cursor.execute(create_db_query)
except Error as e:
    print(e)

In [23]:
create_country_query = """
CREATE TABLE country(
    country_code VARCHAR(4) PRIMARY KEY,
    country VARCHAR(50)
);
"""

create_user_account_query = """
CREATE TABLE user_account(
    account_id MEDIUMINT PRIMARY KEY,
    followers SMALLINT,
    follows SMALLINT,
    profile_picture TINYINT
);
"""

create_user_info_query = """
CREATE TABLE user_info(
    customer_id BIGINT PRIMARY KEY,
    account_id MEDIUMINT,
    language VARCHAR(4),
    gender VARCHAR(1),
    country_code VARCHAR(4),
    FOREIGN KEY(country_code) REFERENCES country(country_code),
    FOREIGN KEY(account_id) REFERENCES user_account(account_id)
);
"""

create_app_type_query = """
CREATE TABLE app_type(
    app_id MEDIUMINT PRIMARY KEY,
    app_type VARCHAR(20)
);
"""

create_app_query = """
CREATE TABLE app(
    account_id MEDIUMINT,
    app_id MEDIUMINT,
    uses_app TINYINT,
    FOREIGN KEY(account_id) REFERENCES user_account(account_id),
    FOREIGN KEY(app_id) REFERENCES app_type(app_id),
    PRIMARY KEY(account_id, app_id)
);
"""

create_login_info_query = """
CREATE TABLE login_info(
    account_id MEDIUMINT,
    days_since_last_login SMALLINT,
    days_since_registration SMALLINT,
    FOREIGN KEY(account_id) REFERENCES user_account(account_id),
    PRIMARY KEY(account_id)
);
"""

create_products_query = """
CREATE TABLE products(
    account_id MEDIUMINT,
    products_liked MEDIUMINT,
    products_listed SMALLINT,
    products_sold SMALLINT,
    products_passrate FLOAT,
    products_wished SMALLINT,
    products_bought SMALLINT,
    FOREIGN KEY(account_id) REFERENCES user_account(account_id),
    PRIMARY KEY(account_id)
);
"""

try:
    with connect(
        host="localhost",
        user=input("Enter username: "),
        password=getpass("Enter password: "),
        database="vestiare_collective"
    ) as connection:
        with connection.cursor() as cursor:
            cursor.execute(create_country_query)
            cursor.execute(create_user_account_query)
            cursor.execute(create_user_info_query)
            cursor.execute(create_app_type_query)
            cursor.execute(create_app_query)
            cursor.execute(create_login_info_query)
            cursor.execute(create_products_query)
            connection.commit()
except Error as e:
    print(e)

##### 4. Data Upload
After creating the database and the tables, the data is uploaded.

In [24]:
insert_country_query = """
INSERT INTO country
(country_code, country)
VALUES (%s, %s);
"""

insert_user_info_query = """
INSERT INTO user_info
(customer_id, account_id, language, gender, country_code)
VALUES (%s, %s, %s, %s, %s);
"""

insert_user_account_query = """
INSERT INTO user_account
(account_id, followers, follows, profile_picture)
VALUES (%s, %s, %s, %s);
"""

insert_app_type_query = """
INSERT INTO app_type
(app_id, app_type)
VALUES(%s, %s);
"""

insert_app_query = """
INSERT INTO app
(account_id, app_id, uses_app)
VALUES(%s, %s, %s);
"""

insert_login_info_query = """
INSERT INTO login_info
(account_id, days_since_last_login, days_since_registration)
VALUES(%s, %s, %s);
"""

insert_products_query = """
INSERT INTO products
(account_id, products_liked, products_listed, products_sold, 
products_passrate, products_wished, products_bought)
VALUES(%s, %s, %s, %s, %s, %s, %s);
"""

#Remember naming_tables
#naming_tables = ["country", "user_account", "user_info", "login_info", "products", "app_type", "app"]

queries_to_insert = [
    insert_country_query,
    insert_user_account_query,
    insert_user_info_query, 
    insert_login_info_query,
    insert_products_query,
    insert_app_type_query,
    insert_app_query
]

try:
    with connect(
        host="localhost",
        user=input("Enter username: "),
        password=getpass("Enter password: "),
        database="vestiare_collective"
    ) as connection:
        with connection.cursor() as cursor:
            # Tables are loaded in the order of naming_tables, so parent tables
            # are filled before the tables that reference them
            for query, table_name in zip(queries_to_insert, naming_tables):
                with open(str(p).strip() + table_name + ".csv", newline="") as file:
                    for row in csv.reader(file):
                        cursor.execute(query, row)
            connection.commit()
except Error as e:
    print(e)
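
The upload order matters because of the foreign keys: a row in a child table can only be inserted once its parent row exists. The sketch below illustrates this, together with batched inserts through `executemany`. It uses the standard library `sqlite3` module as a stand-in for MySQL so that it runs without a server, which means the placeholder is `?` rather than `%s`. The two simplified tables and the sample rows are hypothetical.

```python
import sqlite3

# sqlite3 stands in for MySQL here so the sketch runs without a server
connection = sqlite3.connect(":memory:")
connection.execute("PRAGMA foreign_keys = ON")
connection.execute(
    "CREATE TABLE user_account(account_id INTEGER PRIMARY KEY, followers INTEGER)"
)
connection.execute(
    "CREATE TABLE login_info("
    "account_id INTEGER PRIMARY KEY REFERENCES user_account(account_id), "
    "days_since_last_login INTEGER)"
)

# Hypothetical rows; parent rows go in first so the foreign key is satisfied
accounts = [(0, 147), (1, 167), (2, 137)]
logins = [(0, 11), (1, 12), (2, 11)]

with connection:  # commits on success, rolls back on error
    connection.executemany("INSERT INTO user_account VALUES (?, ?)", accounts)
    connection.executemany("INSERT INTO login_info VALUES (?, ?)", logins)

row_count = connection.execute("SELECT COUNT(*) FROM login_info").fetchone()[0]

# Inserting a child row without a matching parent row is rejected
try:
    connection.execute("INSERT INTO login_info VALUES (99, 5)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(row_count, orphan_rejected)
```

The MySQL connector cursor also provides `executemany`, which may be worth considering in place of the row-by-row `execute` calls when uploading larger tables.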

##### 5. Data Extraction
The required data for the analysis is extracted using queries and saved as a CSV file.

In [25]:
query = """
SELECT *
FROM user_account;
"""

try:
    with connect(
        host="localhost",
        user=input("Enter username: "),
        password=getpass("Enter password: "),
        database="vestiare_collective"
    ) as connection:
            with connection.cursor() as cursor:
                cursor.execute(query)
                columns = cursor.description
                data = cursor.fetchall()
except Error as e:
    print(e)

column_names = []
for column in columns:
    column_names.append(column[0])
pd.DataFrame(data, columns=column_names)

Unnamed: 0,account_id,followers,follows,profile_picture
0,0,147,10,1
1,1,167,8,1
2,2,137,13,0
3,3,131,10,0
4,4,167,8,1
...,...,...,...,...
98905,98908,3,8,1
98906,98909,3,8,1
98907,98910,3,8,1
98908,98911,3,8,1


In [26]:
query = """
SELECT
user_account.account_id,
login_info.days_since_last_login,
login_info.days_since_registration,
user_info.gender,
user_info.country_code,
user_account.followers,
user_account.follows,
user_account.profile_picture,
products.products_liked,
products.products_listed,
products.products_sold,
products.products_passrate,
products.products_wished,
products.products_bought

FROM user_account
LEFT JOIN user_info ON user_account.account_id = user_info.account_id
LEFT JOIN login_info ON user_account.account_id = login_info.account_id
LEFT JOIN products ON user_account.account_id = products.account_id
"""

try:
    with connect(
        host="localhost",
        user=input("Enter username: "),
        password=getpass("Enter password: "),
        database="vestiare_collective"
    ) as connection:
            with connection.cursor() as cursor:
                cursor.execute(query)
                columns = cursor.description
                data = cursor.fetchall()
except Error as e:
    print(e)

column_names = []
for column in columns:
    column_names.append(column[0])
df = pd.DataFrame(data, columns=column_names)
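
The column names above come from `cursor.description`, where the first element of each entry is the column name. A minimal sketch of that pattern is shown below, again using `sqlite3` as a stand-in for MySQL and a hypothetical two-row table; plain dictionaries take the place of the DataFrame.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE user_account(account_id INTEGER, followers INTEGER)")
connection.executemany("INSERT INTO user_account VALUES (?, ?)", [(0, 147), (1, 167)])

cursor = connection.execute(
    "SELECT account_id, followers FROM user_account ORDER BY account_id"
)
# The first element of each description entry is the column name
column_names = [column[0] for column in cursor.description]
records = [dict(zip(column_names, row)) for row in cursor.fetchall()]
connection.close()
print(column_names)
print(records)
```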

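The app usage data is stored in long format (one row per account and app type). It is extracted separately, pivoted to one column per app type, and merged with the rest of the data for the analysis.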

In [27]:
query = """
SELECT account_id, app.uses_app, app_type.app_type
FROM app
JOIN app_type ON app.app_id = app_type.app_id
"""

try:
    with connect(
        host="localhost",
        user=input("Enter username: "),
        password=getpass("Enter password: "),
        database="vestiare_collective"
    ) as connection:
            with connection.cursor() as cursor:
                cursor.execute(query)
                columns = cursor.description
                data = cursor.fetchall()
except Error as e:
    print(e)

column_names = []
for column in columns:
    column_names.append(column[0])
to_pivot = pd.DataFrame(data, columns=column_names)

In [28]:
to_merge = to_pivot.pivot(columns="app_type", index="account_id", values="uses_app").reset_index()
df = df.merge(to_merge, on="account_id")
df

Unnamed: 0,account_id,days_since_last_login,days_since_registration,gender,country_code,followers,follows,profile_picture,products_liked,products_listed,products_sold,products_passrate,products_wished,products_bought,android,ios
0,0,11,3196,M,gb,147,10,1,77,26,174,74.0,104,1,0,1
1,1,12,3204,F,mc,167,8,1,2,19,170,99.0,0,0,0,1
2,2,11,3203,F,fr,137,13,0,60,33,163,94.0,10,3,0,1
3,3,12,3198,F,us,131,10,0,14,122,152,92.0,7,0,0,1
4,4,22,2854,F,us,167,8,1,0,25,125,100.0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
98905,98908,708,3204,M,us,3,8,1,0,0,0,0.0,0,0,0,0
98906,98909,695,3204,M,fr,3,8,1,0,0,0,0.0,0,0,0,1
98907,98910,520,3204,M,be,3,8,1,0,0,0,0.0,0,0,1,0
98908,98911,267,3204,F,it,3,8,1,0,0,0,0.0,0,0,0,0
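
To make the pivot step more concrete, the sketch below reshapes long-format rows into one record per account in plain Python. The sample rows and the helper name `pivot_app_usage` are hypothetical; the notebook itself relies on `DataFrame.pivot`.

```python
def pivot_app_usage(rows):
    """Turn (account_id, uses_app, app_type) rows into one dict per account."""
    wide = {}
    for account_id, uses_app, app_type in rows:
        wide.setdefault(account_id, {"account_id": account_id})[app_type] = uses_app
    return [wide[account_id] for account_id in sorted(wide)]


# Hypothetical long-format rows shaped like the result of the app query
long_rows = [(0, 0, "android"), (0, 1, "ios"), (1, 1, "android"), (1, 0, "ios")]
wide_rows = pivot_app_usage(long_rows)
print(wide_rows)
```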


In [29]:
p = Path("../Tableau_Data")
df.to_csv(p / "user_retention_data.csv", index=False)


#### Step 3 - Tableau Analysis
The business intelligence software Tableau is used to generate insights and visualizations about user inactivity. The exploratory data analysis starts by visualizing the distribution of user inactivity over one year.

For the purpose of this analysis, two user segments are identified: 
- Buyers: Users that have not sold a product
- Sellers: Users with at least one product sold

The data is filtered by user segment and by days since last login, from the earliest value (day 11) to day 365. The accounts are then counted by their number of days since last login.
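
The segmentation and filtering described above are done in Tableau. As a rough equivalent, the logic could be sketched in Python as follows. The sample records and the function names are hypothetical; the keys follow the column names of the extracted CSV.

```python
from collections import Counter


def user_segment(products_sold):
    """Sellers have sold at least one product; everyone else is a buyer."""
    return "seller" if products_sold >= 1 else "buyer"


def inactivity_counts(users, first_day=11, last_day=365):
    """Count accounts per (segment, days_since_last_login) inside the window."""
    counts = Counter()
    for user in users:
        days = user["days_since_last_login"]
        if first_day <= days <= last_day:
            counts[(user_segment(user["products_sold"]), days)] += 1
    return counts


# Hypothetical records using the column names of the extracted CSV
users = [
    {"days_since_last_login": 11, "products_sold": 174},
    {"days_since_last_login": 11, "products_sold": 0},
    {"days_since_last_login": 11, "products_sold": 0},
    {"days_since_last_login": 22, "products_sold": 125},
    {"days_since_last_login": 708, "products_sold": 0},
]
counts = inactivity_counts(users)
print(counts)
```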

![one_year](../Tableau_Analysis/one_year.png)

The **one year inactivity distribution** graph shows that overall user interest in the platform decreases exponentially. This behavior is expected, and possible reasons include:
- New marketing campaigns attracted new users whose inactivity could reflect a lack of interest.
- Active users' interest in the platform declines after some time.
- Active users could be switching to other platforms.
- Active users take breaks or buy seasonally.

In order to better understand the user inactivity, further customer segmentation needs to be performed. 

Buyers make up most of the population, with a ratio of 90% buyers to 10% sellers. The **one year inactivity distribution** graph shows two time-constrained segments within the data: short-term user inactivity, which contains potentially active users and shows the exponential decrease; and long-term user inactivity, which may contain users with a lack of interest.

The average of the distribution, shown by the red line at a value of 45, is used to separate the two time segments. The short term runs from day 11 to day 45, and the long term from day 46 to day 365.
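
The two time segments can be expressed as a small labeling rule. The sketch below assumes the threshold of 45 days and the 11 to 365 day window stated in the text; the function name `time_segment` is hypothetical.

```python
def time_segment(days_since_last_login, threshold=45, first_day=11, last_day=365):
    """Label a user as short-term or long-term inactive, or None if out of range."""
    if not first_day <= days_since_last_login <= last_day:
        return None
    return "short_term" if days_since_last_login <= threshold else "long_term"


labels = [time_segment(days) for days in (11, 45, 46, 365, 708)]
print(labels)
```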

Next, the short-term period is examined.

![short](../Tableau_Analysis/short.png)

The ratio has changed to 80% buyers and 20% sellers, which indicates that sellers are more represented in the short term than over the full year. The **buyers - short term inactivity** and **sellers - short term inactivity** graphs show that user inactivity stops decreasing exponentially after a period of time. For buyers, the balance in user inactivity is reached around day 45, and for sellers around day 20. From this it can be concluded that buyers, on average, lose interest in the platform at a slower rate than sellers.

Next, the long-term user inactivity is analyzed.

![long](../Tableau_Analysis/long.png)

The buyer-to-seller ratio has changed to 95% buyers and 5% sellers, an increase in buyers and a decrease in sellers relative to the one-year proportion. This indicates that most long-term inactivity comes from buyers. The population distribution shows a noise-like pattern, which confirms a stable level of user inactivity.

To predict which users are more likely to fall into the short-term or the long-term segment, further analysis is needed to identify the most important characteristics of buyers and sellers. For this purpose, a default account analysis is performed. It consists of observing how the proportion of default accounts (accounts with no user interaction in a given category) changes from the long term to the short term.
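
A minimal sketch of the default account comparison is given below. It assumes that "no interaction" means a value of zero in the category, which would not hold for followers and follows, since new accounts start with automatic follows. The sample users and the function name `default_share` are hypothetical.

```python
def default_share(users, category):
    """Share of users whose value in `category` is zero (no interaction)."""
    if not users:
        return 0.0
    return sum(1 for user in users if user[category] == 0) / len(users)


# Hypothetical buyers, already split into the two time segments
short_term = [
    {"products_liked": 3, "products_wished": 0},
    {"products_liked": 0, "products_wished": 2},
    {"products_liked": 5, "products_wished": 1},
    {"products_liked": 1, "products_wished": 0},
]
long_term = [
    {"products_liked": 0, "products_wished": 0},
    {"products_liked": 0, "products_wished": 0},
    {"products_liked": 2, "products_wished": 0},
    {"products_liked": 0, "products_wished": 1},
]

for category in ("products_liked", "products_wished"):
    change = default_share(long_term, category) - default_share(short_term, category)
    print(category, change)
```

A larger drop in the share of default accounts from long term to short term suggests that the category separates the two segments more strongly.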

![default_accounts](../Tableau_Analysis/default_accounts.png)

Buyers and sellers differ in category priority. Buyers with higher activity in products liked, products wished, and products bought tend to be short-term users. For sellers, the most important indicators of belonging to the short-term segment are products liked, followers, and pass rate.

Because buyers are so important to the success of the platform, the rest of the analysis focuses on them. Next, the key categories identified above are examined in more detail.

![buyer_overview](../Tableau_Analysis/buyer_overview.png)

Because the values of days since registration are clustered in intervals, it can be inferred that the analyzed data corresponds to the outcome of two marketing campaigns. This was expected given the exponential behavior observed.

The buyers' products liked characteristic is analyzed in more depth. The analysis is broken down by the following categories: gender, profile picture, iOS, Android, and country code.

![buyer_products_liked_comparison](../Tableau_Analysis/buyer_products_liked_comparison.png)

Products Liked Analysis:
- Gender - Female users interact more often in this category.
- Profile Picture - Users with no profile picture like products the most.
- iOS & Android - Users of the official iOS app are more likely to like products.
- Top 5 Countries - Users from Denmark like the most products.

The categories, ordered by importance to the products liked indicator, are as follows:

Profile Picture > iOS > Gender > Android

1. It can be deduced that, to improve the products liked indicator, the most suitable marketing strategy involves attracting female users who live in Denmark and use iOS devices.

![buyer_products_wished_comparison](../Tableau_Analysis/buyer_products_wished_comparison.png)

Products Wished Analysis:
- Gender - Female users are more inclined to add products to their wish list.
- Profile Picture - Users with no profile picture tend to add more products to their wish list.
- iOS & Android - Users that interact with the platform through the iOS app are more inclined to add products to their wish list.
- Top 5 Countries - Users from Germany add the most products to their wish list.

The categories, ordered by importance to the products wished indicator, are as follows:

Profile Picture > iOS > Gender > Android

1. It can be deduced that, to improve the products wished indicator, the most suitable marketing strategy involves attracting female users who live in Germany and use iOS devices.

![buyer_products_bought_comparison](../Tableau_Analysis/buyer_products_bought_comparison.png)

The most important indicator for the success of the business is **Products Bought**. User interaction within the platform is a good indicator of growth, but the act of buying a product is what makes the business profitable.

Products Bought Analysis:
- Gender - Female and male users buy, on average, the same quantity of items.
- Profile Picture - Users with no profile picture tend to buy more products.
- iOS & Android - Customers using the iOS app are more inclined to purchase additional items.
- Top 8 Countries - Germany and Sweden are the two most promising countries in terms of products bought per user.

The categories, ordered by importance to the products bought indicator, are as follows:

iOS > Profile Picture > Android > Gender

1. It can be deduced that, to generate more sales, the most suitable marketing strategy involves attracting male and female users through the iOS and Android apps. The campaign should focus on Germany and Sweden.

2. Most buyers are women. Surprisingly, men on average make as many purchases as women. One recommendation to improve sales is a marketing campaign that targets male users only.

3. Users from Germany appear to be fans of the platform, so a marketing campaign focused only on Germany would be a good idea.

4. Although Sweden is one of the top countries buying products from Vestiaire Collective, it did not appear in the products liked or products wished indicators. This could be interpreted as a successful emerging market. Marketing campaigns targeting user interaction in Sweden should be prioritized.

#### Conclusion
The analyzed data corresponds to two specific marketing campaigns. Users that have not logged back in after eleven days can be classified into two time segments: short term, which spans from day 11 to day 45, and long term, which spans from day 46 to day 365. Users can also be divided into two segments: buyers and sellers. Both types of users show an exponential loss of interest after a couple of days, which could be happening for several reasons:

- New marketing campaigns attracted new users whose inactivity could reflect a lack of interest.
- Active users' interest in the platform declines after some time.
- Active users could be switching to other platforms.
- Active users take breaks or buy seasonally.

Buyers represent most of the population in both the short-term and long-term categories. Short-term and long-term users can be predicted using certain indicators. For buyers, the indicators are products liked, products wished, and products bought; for sellers, they are products liked, followers, and pass rate. A deeper analysis of the buyers' indicators produced the following insights and recommendations:

1. To improve the products liked indicator, the most suitable marketing strategy involves attracting female users who live in Denmark and use iOS devices.
2. To improve the products wished indicator, the most suitable marketing strategy involves attracting female users who live in Germany and use iOS devices.
3. To generate more sales, the most suitable marketing strategy involves attracting male and female users through the iOS and Android apps. The campaign should focus on Germany and Sweden.
4. Design and apply a marketing campaign that focuses on men.
5. Design and apply a marketing campaign focused only on Germany.
6. Design and apply a marketing campaign focused on user signup and interaction for people in Sweden.

#### References
https://www.independent.co.uk/life-style/fashion/vestiaire-collective-the-preowned-fashion-site-to-know-about-a7958901.html

https://www.eurazeo.com/en/vestiaire-collective-0

https://internetretailing.net/marketplaces/marketplaces/vestiaire-collective-expands-its-community-to-23m-though-tie-up-with-tradesy-in-the-us-24540

https://thenextweb.com/news/vestiaire-collectives-ceo-self-disrupt-to-keep-up-with-customers-evolving-needs-and-wants#:~:text=Vestiaire%20Collective's%20target%20customers%20are,resale%20platforms%2C%E2%80%9D%20he%20notes.