This project is the first part of a two-part series. In this first part, you will blend and format data and deal with outliers. In the second part, you will use your cleaned-up dataset to create a linear regression model.
Pawdacity is a leading pet store chain in Wyoming with 13 stores throughout the state. This year, Pawdacity would like to expand and open a 14th store. Your manager has asked you to perform an analysis to recommend the city for Pawdacity’s newest store, based on predicted yearly sales.
Your first step in predicting yearly sales is to format and blend data from the different datasets and deal with outliers.
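One common way to deal with outliers is the 1.5 × IQR rule, sketched below with the Python standard library only. This is an illustration under stated assumptions, not necessarily the method used in the linked Power BI file: the function name `iqr_outliers`, the multiplier `k`, and the city names and sales figures are all made up for the example.

```python
from statistics import quantiles


def iqr_outliers(values_by_city, k=1.5):
    """Return the entries whose value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # "inclusive" treats the data as the full population rather than a sample.
    q1, _, q3 = quantiles(list(values_by_city.values()), n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return {
        city: value
        for city, value in values_by_city.items()
        if value < lower or value > upper
    }


# Placeholder yearly sales for hypothetical cities (NOT Pawdacity data).
example_sales = {
    "City A": 200_000,
    "City B": 220_000,
    "City C": 210_000,
    "City D": 230_000,
    "City E": 900_000,
}
flagged = iqr_outliers(example_sales)
```

A flagged city is only a candidate for review. Whether to drop it, cap its value, or keep it depends on how many rows the dataset can afford to lose and on whether the value is a data error or a genuinely unusual city.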
You are provided with the following data:
The monthly sales data for all of the Pawdacity stores for the year 2010.
NAICS data on the most recent sales of all competitor stores, where total sales is equal to 12 months of sales.
A partially parsed data file that can be used for population numbers.
Demographic data (Households with individuals under 18, Land Area, Population Density, and Total Families) for each city and county in the state of Wyoming.
For readers who are unfamiliar with the US city system: a state contains counties, and each county contains one or more cities.
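The blending step can be sketched in plain Python as follows: monthly sales are summed into one yearly total per city, and the totals are then joined to the demographic records on the city name. This is a minimal sketch only. The field names (`city`, `sales`, `land_area`, `total_families`), the helper names, and all values are hypothetical placeholders, and the actual project does this work in Power BI.

```python
def yearly_sales_by_city(monthly_rows):
    """Sum the monthly sales figures into one yearly total per city."""
    totals = {}
    for row in monthly_rows:
        totals[row["city"]] = totals.get(row["city"], 0) + row["sales"]
    return totals


def blend(yearly_totals, demographics):
    """Inner-join yearly totals with demographic records on the city name."""
    blended = []
    for city, total in yearly_totals.items():
        demo = demographics.get(city)
        if demo is None:
            continue  # no demographic record for this city: dropped by the inner join
        blended.append({"city": city, "yearly_sales": total, **demo})
    return blended


# Placeholder rows for hypothetical cities (NOT the project's data).
monthly_rows = (
    [{"city": "City A", "sales": 100} for _ in range(12)]
    + [{"city": "City B", "sales": 200} for _ in range(12)]
)
demographics = {
    "City A": {"land_area": 10.0, "total_families": 500},
    "City C": {"land_area": 20.0, "total_families": 800},
}

totals = yearly_sales_by_city(monthly_rows)
blended = blend(totals, demographics)
```

An inner join is used here so that every blended row has both a sales figure and demographics. Cities that drop out of the join are worth inspecting, since a mismatch is often just a difference in spelling or whitespace in the city name.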
https://github.com/gmalekar/Cleaning-Dataset-For-LinearRegression/blob/master/DatasetCleaning.pbix