## PROJECT OVERVIEW
Onyxio Company Ltd faces the challenge of accurately tracking and analyzing its revenue and expenses across various business lines.
The company aims to enhance its financial decision-making process by developing a comprehensive income statement that captures key financial metrics. 
Accurate financial reporting is crucial for maintaining profitability, managing costs, and making informed strategic decisions.



## BUSINESS PROBLEM
Onyxio Company Ltd, a diversified business, aims to enhance its financial decision-making process to improve profitability and competitiveness. 
To achieve this objective, Onyxio seeks to develop a comprehensive income statement that accurately tracks and analyzes revenue and expenses across various business lines.
By leveraging detailed business financial data and advanced analytics, Onyxio aims to identify key revenue and expense drivers, optimize cost structures, and implement strategic financial decisions to foster long-term business growth and stability.

## PROJECT OBJECTIVE
The objective of this project is to analyze Onyxio Company Ltd’s business financial data to understand the factors influencing revenue and expenses and to develop a comprehensive income statement.
By leveraging financial analysis and predictive modeling, the project aims to:

1. Identify key revenue and expense drivers across different business lines.

2. Analyze trends and patterns in financial performance over time.

3. Provide actionable insights to Onyxio Company Ltd for optimizing cost structures and enhancing profitability.

4. Develop a detailed income statement to support strategic financial decision-making.

5. Improve Onyxio Company Ltd’s financial management and competitiveness in the market.

## RESEARCH QUESTIONS
1. What are the primary sources of revenue and expenses for Onyxio Company Ltd?

2. How do different business lines contribute to the overall financial performance?

3. What are the trends in revenue and expenses over time?

4. How can Onyxio Company Ltd optimize its cost structure to improve profitability?

## DATA UNDERSTANDING
The dataset used in this project was obtained from Onyx Company’s internal financial records, which contain detailed information on revenue and expenses across various business lines. 
This dataset is well suited to the business problem of creating an accurate and comprehensive income statement.
The dataset contains the following columns:

Year: Year of revenue/expense.

Month - name: Month of revenue/expense.

Month -sequence: Month of revenue/expense, expressed as a number.

Date: Date of revenue/expense, expressed as the last day of the month.

Business Line: Business line generating revenue/expense (e.g., Sports Inventory, Sportswear, Nutrition and Food Supplements).

Amount, $: Revenue or expense amount in USD. Expenses are recorded as negative values.

Expense subgroup: Additional subgroups categorizing expenses associated with Opex and COGS (e.g., Rent, Equipment, Packaging, Shipping).

Revenue / Expense Group: Subcategory of revenue/expense (e.g., Sales, Consulting and professional services, Other income, Opex, COGS).

Revenue or expense: Column indicating whether the associated amount is revenue or expense.
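The column layout described above can be verified before any analysis starts. The following is a minimal sketch, not part of the original notebook: the helper name `missing_columns` is hypothetical, the expected names are copied from the `df.columns` output later in this notebook, and the single sample row is taken from the dataset preview.

```python
import pandas as pd

# Column names as they appear in the loaded file (copied from the df.columns output).
EXPECTED_COLUMNS = [
    "Year", "Month - name", "Month -sequence", "Date", "Business Line",
    "Amount, $", "Expense subgroup", "Revenue / Expense Group",
    "Revenue or expense",
]


def missing_columns(frame):
    """Return the expected column names that are absent from the frame (hypothetical helper)."""
    return [col for col in EXPECTED_COLUMNS if col not in frame.columns]


# One row copied from the dataset preview, used only for illustration.
sample = pd.DataFrame([{
    "Year": 2023,
    "Month - name": "January",
    "Month -sequence": 1,
    "Date": "1/31/2023",
    "Business Line": "Nutrition and Food Supplements",
    "Amount, $": 153000,
    "Expense subgroup": None,
    "Revenue / Expense Group": "Sales",
    "Revenue or expense": "Revenue",
}])

print(missing_columns(sample))                          # no columns missing
print(missing_columns(sample.drop(columns=["Date"])))   # reports the dropped column
```

Running the same check on the full `df` right after loading would surface a renamed or missing column early, before it causes a `KeyError` further down.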

In [2]:
# importing the required libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np


In [3]:
#pip install xlrd

In [6]:
# loading data
df = pd.read_csv('Onyx Data.csv',encoding='ISO-8859-1')
df

Unnamed: 0,Year,Month - name,Month -sequence,Date,Business Line,"Amount, $",Expense subgroup,Revenue / Expense Group,Revenue or expense
0,2023,January,1,1/31/2023,Nutrition and Food Supplements,153000,,Sales,Revenue
1,2023,January,1,1/31/2023,Nutrition and Food Supplements,27000,,Consulting and professional services,Revenue
2,2023,January,1,1/31/2023,Nutrition and Food Supplements,6000,,Other income,Revenue
3,2023,January,1,1/31/2023,Nutrition and Food Supplements,-15000,Rent,Opex,Expense
4,2023,January,1,1/31/2023,Nutrition and Food Supplements,-9000,Equipment,Opex,Expense
...,...,...,...,...,...,...,...,...,...
575,2023,December,1,12/31/2023,Sportswear,-4000,Packaging,COGS,Expense
576,2023,December,1,12/31/2023,Sportswear,-7000,Shipping,COGS,Expense
577,2023,December,1,12/31/2023,Sportswear,-9000,Sales,COGS,Expense
578,2023,December,1,12/31/2023,Sportswear,-110000,Labor,COGS,Expense
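In the preview above, revenue amounts are positive and expense amounts are negative. A quick sanity check of that sign convention could look like the sketch below. It assumes the convention holds for the whole file, which the preview alone does not prove; the sample rows are copied from the preview and the variable names are illustrative.

```python
import pandas as pd

# Four rows copied from the dataset preview.
sample = pd.DataFrame({
    "Amount, $": [153000, 27000, -15000, -9000],
    "Revenue or expense": ["Revenue", "Revenue", "Expense", "Expense"],
})

# Expected sign of the amount: +1 for revenue rows, -1 for expense rows.
expected_sign = sample["Revenue or expense"].map({"Revenue": 1, "Expense": -1})

# A row is flagged when its amount has the opposite sign to its label.
# Zero amounts are not flagged.
sign_mismatch = sample[sample["Amount, $"] * expected_sign < 0]
print(len(sign_mismatch))
```

Any flagged rows would need a closer look before amounts are summed, because a mislabeled sign silently distorts every total in the income statement.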


### Data Cleaning and Preparation

In [7]:
# check for columns
df.columns

Index(['Year', 'Month - name', 'Month -sequence', 'Date', 'Business Line',
       'Amount, $', 'Expense subgroup', 'Revenue / Expense Group',
       'Revenue or expense'],
      dtype='object')

In [8]:
#check for datatypes
df.dtypes

Year                        int64
Month - name               object
Month -sequence             int64
Date                       object
Business Line              object
Amount, $                   int64
Expense subgroup           object
Revenue / Expense Group    object
Revenue or expense         object
dtype: object

In [9]:
# summarize the frame: column labels, data types, non-null count per column,
# memory usage, and the range index
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 580 entries, 0 to 579
Data columns (total 9 columns):
 #   Column                   Non-Null Count  Dtype 
---  ------                   --------------  ----- 
 0   Year                     580 non-null    int64 
 1   Month - name             580 non-null    object
 2   Month -sequence          580 non-null    int64 
 3   Date                     580 non-null    object
 4   Business Line            580 non-null    object
 5   Amount, $                580 non-null    int64 
 6   Expense subgroup         468 non-null    object
 7   Revenue / Expense Group  580 non-null    object
 8   Revenue or expense       580 non-null    object
dtypes: int64(3), object(6)
memory usage: 40.9+ KB


In [11]:
# convert the Date column from string (month/day/year) to datetime
df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%Y')
df.head()

Unnamed: 0,Year,Month - name,Month -sequence,Date,Business Line,"Amount, $",Expense subgroup,Revenue / Expense Group,Revenue or expense
0,2023,January,1,2023-01-31,Nutrition and Food Supplements,153000,,Sales,Revenue
1,2023,January,1,2023-01-31,Nutrition and Food Supplements,27000,,Consulting and professional services,Revenue
2,2023,January,1,2023-01-31,Nutrition and Food Supplements,6000,,Other income,Revenue
3,2023,January,1,2023-01-31,Nutrition and Food Supplements,-15000,Rent,Opex,Expense
4,2023,January,1,2023-01-31,Nutrition and Food Supplements,-9000,Equipment,Opex,Expense
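Once `Date` is a datetime column, it can be cross-checked against the other date fields. The earlier preview shows December rows with a `Month -sequence` of 1, which may be a data-entry issue or an artifact of the export. The sketch below shows one way to flag such rows. It is an illustration on two rows copied from the preview, not a statement about the full dataset.

```python
import pandas as pd

# Two rows copied from the dataset preview (one January, one December).
sample = pd.DataFrame({
    "Month - name": ["January", "December"],
    "Month -sequence": [1, 1],
    "Date": pd.to_datetime(["1/31/2023", "12/31/2023"], format="%m/%d/%Y"),
})

# Rows where the stored month number disagrees with the month of the date.
month_mismatch = sample[sample["Month -sequence"] != sample["Date"].dt.month]

# Rows where the date is not the last day of its month.
not_month_end = sample[~sample["Date"].dt.is_month_end]

print(month_mismatch["Month - name"].tolist())
print(len(not_month_end))
```

If the mismatch turns out to be real in the full file, deriving the month number from `Date` is likely safer than relying on `Month -sequence` for time-based ordering.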


In [15]:
#checking for null values
df.isnull().sum()

Year                         0
Month - name                 0
Month -sequence              0
Date                         0
Business Line                0
Amount, $                    0
Expense subgroup           112
Revenue / Expense Group      0
Revenue or expense           0
dtype: int64
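Before deciding how to handle the missing `Expense subgroup` values, it helps to see which kind of row they belong to. The following sketch breaks the null count down by `Revenue or expense`. It uses five rows copied from the dataset preview, so the counts are illustrative only.

```python
import pandas as pd

# Five rows copied from the dataset preview: three revenue rows, two expense rows.
sample = pd.DataFrame({
    "Expense subgroup": [None, None, None, "Rent", "Equipment"],
    "Revenue or expense": ["Revenue", "Revenue", "Revenue", "Expense", "Expense"],
})

# Count missing subgroups separately for revenue and expense rows.
missing_by_type = (
    sample["Expense subgroup"]
    .isnull()
    .groupby(sample["Revenue or expense"])
    .sum()
)
print(missing_by_type)
```

If the same pattern holds on the full `df`, the nulls are structural (revenue has no expense subgroup) rather than missing data.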

In [23]:
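# drop rows with no expense subgroup
# note: the rows previewed without a subgroup are revenue rows,
# so df appears to hold expense rows only after this step
df.dropna(subset=['Expense subgroup'], inplace=True)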
df

Unnamed: 0,Year,Month - name,Month -sequence,Date,Business Line,"Amount, $",Expense subgroup,Revenue / Expense Group,Revenue or expense
3,2023,January,1,2023-01-31,Nutrition and Food Supplements,-15000,Rent,Opex,Expense
4,2023,January,1,2023-01-31,Nutrition and Food Supplements,-9000,Equipment,Opex,Expense
5,2023,January,1,2023-01-31,Nutrition and Food Supplements,-30000,Marketing,Opex,Expense
6,2023,January,1,2023-01-31,Nutrition and Food Supplements,-40000,Payroll,Opex,Expense
7,2023,January,1,2023-01-31,Nutrition and Food Supplements,-30000,R&D,Opex,Expense
...,...,...,...,...,...,...,...,...,...
575,2023,December,1,2023-12-31,Sportswear,-4000,Packaging,COGS,Expense
576,2023,December,1,2023-12-31,Sportswear,-7000,Shipping,COGS,Expense
577,2023,December,1,2023-12-31,Sportswear,-9000,Sales,COGS,Expense
578,2023,December,1,2023-12-31,Sportswear,-110000,Labor,COGS,Expense
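The `dropna` call above removes every row whose `Expense subgroup` is empty, and in the preview those are the revenue rows. Since an income statement needs both sides, one alternative is to label the missing subgroup instead of dropping the row. The sketch below illustrates this under stated assumptions: the placeholder `"Not applicable"` is an arbitrary choice, the five rows are copied from the preview, and the subtotal covers only those rows, so it is not a company result.

```python
import pandas as pd

# Five rows copied from the dataset preview (January 2023).
sample = pd.DataFrame({
    "Business Line": ["Nutrition and Food Supplements"] * 5,
    "Amount, $": [153000, 27000, 6000, -15000, -9000],
    "Expense subgroup": [None, None, None, "Rent", "Equipment"],
    "Revenue / Expense Group": [
        "Sales", "Consulting and professional services", "Other income",
        "Opex", "Opex",
    ],
    "Revenue or expense": ["Revenue", "Revenue", "Revenue", "Expense", "Expense"],
})

# Alternative to dropna: fill the missing subgroup so revenue rows are kept.
kept = sample.copy()
kept["Expense subgroup"] = kept["Expense subgroup"].fillna("Not applicable")

# One line per revenue/expense group, as a first cut of an income statement.
statement = kept.groupby("Revenue / Expense Group")["Amount, $"].sum()

# Totals per label; expenses are stored as negative amounts, so they are added.
totals = kept.groupby("Revenue or expense")["Amount, $"].sum()
revenue = totals.get("Revenue", 0)
expenses = totals.get("Expense", 0)
subtotal = revenue + expenses  # covers the five sample rows only

print(statement)
print(revenue, expenses, subtotal)
```

Grouping additionally by `Business Line` or by month would extend the same idea toward the per-line and over-time views named in the research questions.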
