This project is the first part of a two-part series. In this first part, you blend and format data and deal with outliers. In the second part, you use the cleaned-up dataset to build a linear regression model.

gmalekar/Creating-Analytical-Dataset-LinearRegression

Cleaning-Dataset-For-LinearRegression

The Business Problem

Pawdacity is a leading pet store chain in Wyoming with 13 stores throughout the state. This year, Pawdacity would like to expand and open a 14th store. Your manager has asked you to perform an analysis to recommend the city for Pawdacity’s newest store, based on predicted yearly sales.

The first step in predicting yearly sales is to format and blend data from the different datasets and to deal with outliers.
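The brief does not say how outliers should be identified. As a minimal sketch, assuming the common interquartile-range (IQR) rule with a 1.5 multiplier, the check could look like the following. The function name `iqr_outliers`, the city labels, and the sales figures are all hypothetical and are not taken from the project data.

```python
from statistics import quantiles


def iqr_outliers(values_by_city, k=1.5):
    """Return the cities whose value falls outside [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: the IQR rule is used here only as an illustration; the
    project itself may rely on a different outlier criterion.
    """
    values = list(values_by_city.values())
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return sorted(
        city for city, value in values_by_city.items()
        if value < lower or value > upper
    )


# Hypothetical yearly sales per city (illustrative numbers only).
sample_sales = {
    "City A": 200_000,
    "City B": 210_000,
    "City C": 220_000,
    "City D": 230_000,
    "City E": 240_000,
    "City F": 900_000,
}

flagged = iqr_outliers(sample_sales)
print(flagged)
```

A flagged city is only a candidate: whether to remove, keep, or impute it is a judgment call that depends on why the value is extreme.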

We have the following information to work with:

- The monthly sales data for all of the Pawdacity stores for the year 2010.
- NAICS data on the most current sales of all competitor stores, where total sales is equal to 12 months of sales.
- A partially parsed data file that can be used for population numbers.
- Demographic data (Households with individuals under 18, Land Area, Population Density, and Total Families) for each city and county in the state of Wyoming.

For people who are unfamiliar with the US city system: a state contains counties, and each county contains one or more cities.
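The actual cleaning and merging for this project was done in Power Query. Purely as an illustration of the blending idea, here is a rough Python sketch that sums monthly sales into a yearly total per city and joins the result to demographic rows on the city name. The field names, the `normalize` helper, and all values are hypothetical placeholders, not the real dataset.

```python
from collections import defaultdict


def normalize(city):
    """Collapse whitespace and ignore case so city names match across files."""
    return " ".join(city.split()).casefold()


def yearly_sales_by_city(monthly_rows):
    """Sum the monthly sales rows into one yearly total per city."""
    totals = defaultdict(int)
    for row in monthly_rows:
        totals[normalize(row["city"])] += row["sales"]
    return dict(totals)


def blend(yearly_sales, demographics_rows):
    """Inner-join yearly sales with demographic rows on the normalized city name."""
    demo_by_city = {normalize(r["city"]): r for r in demographics_rows}
    blended = []
    for city, total in sorted(yearly_sales.items()):
        demo = demo_by_city.get(city)
        if demo is None:
            continue  # no demographic match: the city is dropped (inner join)
        blended.append({
            "city": city,
            "yearly_sales": total,
            "population_density": demo["population_density"],
            "total_families": demo["total_families"],
        })
    return blended


# Hypothetical input rows (illustrative values only).
monthly_rows = [
    {"city": "City A", "sales": 100},
    {"city": "city a ", "sales": 150},  # same city, inconsistent formatting
    {"city": "City B", "sales": 300},   # has no demographic row below
]
demographics_rows = [
    {"city": "City A", "population_density": 2.5, "total_families": 1200},
]

dataset = blend(yearly_sales_by_city(monthly_rows), demographics_rows)
print(dataset)
```

An inner join is used so that only cities present in both sources reach the final dataset. A left join would instead keep unmatched cities with missing demographic fields, which can be useful for spotting name mismatches.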

The Power Query steps I used to clean and merge the data are in this Power BI file:

https://github.com/gmalekar/Cleaning-Dataset-For-LinearRegression/blob/master/DatasetCleaning.pbix

The written solution is in this PDF:

https://github.com/gmalekar/Cleaning-Dataset-For-LinearRegression/blob/master/Cleaning%20Data%20Girish.pdf
