
### Instructions

#### Goal of the Project

This project lets you practise, through a set of activities, the concepts covered in the following lessons:

  * Streamlit Widgets II
  * Multipage Streamlit App I

---

#### Getting Started:

1. Click on this link to open the Colab file for this project.

   https://colab.research.google.com/drive/1_2dMWdc8o-GTY0_ktUikE1aQD9Ia5C2d

2. Create a duplicate copy of the Colab file as described below.

  - Click on the **File menu**. A new drop-down list will appear.

   <img src='https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/images/lesson-0/0_file_menu.png' width=500>

  - Click on the **Save a copy in Drive** option. A duplicate copy will be created and will open in a new tab in your web browser.

  <img src='https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/images/lesson-0/1_create_colab_duplicate_copy.png' width=500>

3. After creating the duplicate copy of the notebook, please rename it in the **YYYY-MM-DD_StudentName_Project96** format.

4. Now, write your code in the prescribed code cells.


---

#### Problem Statement

In this project, you need to create a web app whose home page displays the name of the app and provides a description of the data.

---

### Dataset Description

The Census dataset contains 32561 instances with 14 features and 1 target column, summarised below:

|Field|Description|
|---:|:---|
|age|age of the person, Integer.|
|work-class| employment information about the individual, Categorical.|
|fnlwgt| unknown weights, Integer.|
|education| highest level of education obtained, Categorical.|
|education-years|number of years of education, Integer.|
|marital-status| marital status of the person, Categorical.|
|occupation|job title, Categorical.|
|relationship| the individual's relationship within the family, such as wife or husband, Categorical.|
|race|Categorical.|
|sex| gender, Male or Female.|
|capital-gain| gain from sources other than salary/wages, Integer.|
|capital-loss| loss from sources other than salary/wages, Integer.|
|hours-per-week| hours worked per week, Integer.|
|native-country| name of the native country, Categorical.|
|income-group| annual income, Categorical,  **<=50k** or **>50k**.|


**Notes:**
1. The dataset has no header row, so the column names must be added manually.
2. Invalid values in the dataset are marked as **"?"**.
3. Because no information is available about **fnlwgt**, it can be removed before model training.
4. Values throughout the dataset carry leading **whitespace (" ")**, so an invalid value actually appears as **" ?"**.
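
The notes above can also be handled while the file is being read. The following is a minimal sketch, assuming pandas is available. The three sample rows are made up for illustration and only mimic the layout of the real file (no header row, a space after each comma, `?` for invalid values), and `load_clean` is a hypothetical helper name. It uses `read_csv` options in place of the `replace()` calls used in the project's own code.

```python
import io

import pandas as pd

# Column names in file order, as used in the project's code.
COLUMN_NAMES = [
    "age", "workclass", "fnlwgt", "education", "education-years",
    "marital-status", "occupation", "relationship", "race", "gender",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "income",
]

# Hypothetical sample rows, not taken from the real dataset.
SAMPLE_CSV = (
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
    "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K\n"
    "54, ?, 180211, HS-grad, 9, Married-civ-spouse, ?, "
    "Husband, White, Male, 0, 0, 60, United-States, >50K\n"
    "28, Private, 338409, Bachelors, 13, Married-civ-spouse, Prof-specialty, "
    "Wife, Black, Female, 0, 0, 40, ?, <=50K\n"
)


def load_clean(source):
    """Read the raw file and apply notes 1 to 4 in a single pass."""
    df = pd.read_csv(
        source,
        header=None,            # note 1: the file has no header row
        names=COLUMN_NAMES,     # note 1: add the column names manually
        skipinitialspace=True,  # note 4: drop the space after each comma
        na_values="?",          # note 2: treat "?" as a missing value
    )
    df = df.dropna()                  # remove rows with invalid values
    return df.drop(columns="fnlwgt")  # note 3: remove the unused column


clean = load_clean(io.StringIO(SAMPLE_CSV))
print(clean)
```

Only the first sample row survives, because the other two each contain at least one `?`. Passing the dataset URL or a local file path instead of the `StringIO` object should work the same way.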



**Dataset Credits:** https://archive.ics.uci.edu/ml/datasets/adult 

**Dataset Creator:**
```
Dua, D., & Graff, C.. (2017). UCI Machine Learning Repository.
```

---

### List of Activities
 
**Activity 1:** Main Page Configuration
  
**Activity 2:** View Data Configuration

---

#### Creating Python File for the Census Visualisation Web App


In this activity, you have to create a Python file `census_main.py` in the Sublime editor and save it in the `Python_scripts` folder. 

Copy the code given below into the `census_main.py` file. You have seen this code before: it defines a function that loads the data from the CSV file and cleans it.

**Dataset Link:** https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/whitehat-ds-datasets/adult.csv

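**Note:** Do not run the code shown below in this notebook. It will throw a `ModuleNotFoundError`, because the `streamlit` module is not installed in the notebook environment.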


In [2]:
# Open Sublime text editor, create a new Python file, copy the following code in it and save it as 'census_main.py'.
# Import modules
import numpy as np
import pandas as pd
import streamlit as st

# Cache the result so that the dataset is not reloaded on every rerun of the app.
@st.cache()
def load_data():
    # Load the Adult Income dataset into a DataFrame. The file has no header row.
    df = pd.read_csv('https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/whitehat-ds-datasets/adult.csv', header=None)

    # Create the list of column names, in file order.
    column_name = ['age', 'workclass', 'fnlwgt', 'education', 'education-years', 'marital-status', 'occupation',
                   'relationship', 'race', 'gender', 'capital-gain', 'capital-loss', 'hours-per-week', 'native-country', 'income']

    # Rename the columns (0, 1, 2, ...) to the names in the list using 'rename()'.
    df.rename(columns=dict(enumerate(column_name)), inplace=True)

    # Replace the invalid values ' ?' (note the leading space) with 'np.nan'.
    for column in ['native-country', 'workclass', 'occupation']:
        df[column] = df[column].replace(' ?', np.nan)

    # Delete the rows with invalid values using the 'dropna()' function.
    df.dropna(inplace=True)

    # Delete the 'fnlwgt' column, which is not required, using the 'drop()' function.
    df.drop(columns='fnlwgt', inplace=True)

    return df

census_df = load_data()


---

#### Activity 1: Main Page Configuration

On the main page, i.e. `census_main.py`, perform the following tasks:

1. Set the `page_title`, `page_icon`, `layout` and `initial_sidebar_state` parameters to configure the default settings of the web page.

 <img src='https://i.imgur.com/u7huSLR.png' width=300/>
 
 **Hint:** Use the `set_page_config()` function.

2. Add the name of the app and its description.

  <img src='https://i.imgur.com/PSeajjk.png' width=500/>
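
The four settings in step 1 map onto keyword arguments of `st.set_page_config()`. Below is a minimal sketch of collecting them in one place and checking the two settings that take a fixed set of options. `PAGE_SETTINGS`, `ALLOWED` and `check_settings` are illustrative names, not part of Streamlit, and the option values listed are assumed to match your installed Streamlit version. The import is guarded so that the block also runs where Streamlit is not installed.

```python
# Illustrative settings; the title and icon values are example choices.
PAGE_SETTINGS = {
    "page_title": "Census Visualisation Web App",
    "page_icon": "random",            # "random" asks Streamlit for a random emoji
    "layout": "centered",
    "initial_sidebar_state": "auto",
}

# Options assumed to be accepted by st.set_page_config() for these settings.
ALLOWED = {
    "layout": {"centered", "wide"},
    "initial_sidebar_state": {"auto", "expanded", "collapsed"},
}


def check_settings(settings):
    """Return the names of settings whose value is not an accepted option."""
    return [name for name, options in ALLOWED.items()
            if settings.get(name) not in options]


try:
    import streamlit as st
except ImportError:
    st = None  # e.g. a notebook runtime without Streamlit

problems = check_settings(PAGE_SETTINGS)
if st is not None and not problems:
    st.set_page_config(**PAGE_SETTINGS)
```

In `census_main.py` itself, the call is best placed once, near the top of the script and before other Streamlit output, which is where Streamlit has traditionally required it.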


  

In [3]:
# Configure the main page by setting its title and icon that will be displayed in a browser tab.
# Import the streamlit Python module.
import streamlit as st
# Configure your home page. Older Streamlit releases require this to be the first Streamlit command in the script.
st.set_page_config(page_title="census web app", page_icon="random", layout="centered", initial_sidebar_state="auto")
# Set the title to the home page contents.
st.title("Census Visualisation Web App")
# Provide a brief description for the web app.
st.write("This app allows a user to explore and visualize the census data")


---

#### Activity 2: View Dataset and Explore its Column Data

In this activity, perform the following tasks on the same page, i.e. `census_main.py`:

1. Display the original dataset. 

  <img src='https://i.imgur.com/hW8Kylv.png' width=500/>

 **Hint:** Use the `st.beta_expander()` function (named `st.expander()` in later Streamlit releases) to display or hide the DataFrame.

2. Display the **column names**, the **column data types**, the data of an individual **column**, and the **mean**, **median**, **quartile** and **standard deviation** values of the numeric columns of the dataset using the `table()` function of the Streamlit module.

 **Hint:** Display the widgets horizontally using the **`st.beta_columns()`** function (named `st.columns()` in later Streamlit releases).
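
The tables requested in step 2 can be prepared with plain pandas before they are handed to Streamlit. The following is a minimal sketch, assuming pandas is available. `column_overview` and `numeric_summary` are hypothetical helper names, and the three-row `demo` frame is made-up data standing in for `census_df`.

```python
import pandas as pd


def column_overview(df):
    """Return one row per column: its name and its data type as text."""
    return pd.DataFrame({
        "column": list(df.columns),
        "dtype": [str(dtype) for dtype in df.dtypes],
    })


def numeric_summary(df):
    """Return count, mean, std, min, the quartiles and max of numeric columns.

    In the result, the row labelled '50%' is the median.
    """
    return df.describe()


# Hypothetical three-row frame standing in for 'census_df'.
demo = pd.DataFrame({
    "age": [39, 50, 38],
    "workclass": ["State-gov", "Private", "Private"],
    "hours-per-week": [40, 13, 40],
})

overview = column_overview(demo)
summary = numeric_summary(demo)
print(overview)
print(summary)
```

Either result can be passed to `st.table()`, for example `st.table(column_overview(census_df))`. Converting the data types to text first keeps the table to plain strings, which are straightforward to display.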

In [None]:
# View Dataset Configuration
st.header("View Data")
# Add an expander and display the dataset as a static table within the expander.
with st.beta_expander("View Data Set"):
    st.table(census_df)
# Create three beta_columns.
beta_col1, beta_col2, beta_col3 = st.beta_columns(3)
# Add a checkbox in the first column. Display the column names of 'census_df' on the click of checkbox.
with beta_col1:
    if st.checkbox("show all column name"):
        st.table(list(census_df.columns))
# Add a checkbox in the second column. Display the column data-types of 'census_df' on the click of checkbox.
with beta_col2:
    if st.checkbox("show all column datatype"):
        # Convert the data-types to strings so that they are displayed as plain text.
        st.table(census_df.dtypes.astype(str))
# Add a checkbox in the third column followed by a selectbox which accepts the column name whose data needs to be displayed.
with beta_col3:
    if st.checkbox("show all column data"):
        column_data = st.selectbox("select column", tuple(census_df.columns))
        # 'column_data' holds a column name, so use it to index the DataFrame.
        st.table(census_df[column_data])
# Display summary of the dataset on the click of checkbox.
if st.checkbox("show all summary"):
    # 'describe()' returns the count, mean, std, min, quartiles (50% is the median) and max of the numeric columns.
    st.table(census_df.describe())

**Expected Output of the Home Page**:

<img src='https://i.imgur.com/2cLTGHq.png' width=660/>

---

### Submitting the Project:

1. After finishing the project, click on the **Share** button on the top right corner of the notebook. A new dialog box will appear.

  <img src='https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/images/project-share-images/2_share_button.png' width=500>

2. In the dialog box, make sure that the '**Anyone on the Internet with this link can view**' option is selected and then click on the **Copy link** button.

   <img src='https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/images/project-share-images/3_copy_link.png' width=500>

3. The link of the duplicate copy (named as **YYYY-MM-DD_StudentName_Project96**) of the notebook will get copied.

   <img src='https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/images/project-share-images/4_copy_link_confirmation.png' width=500>

4. Go to your dashboard and click on the **My Projects** option.
   
   <img src='https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/images/project-share-images/5_student_dashboard.png' width=800>

  <img src='https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/images/project-share-images/6_my_projects.png' width=800>

5. Click on the **View Project** button for the project you want to submit.

   <img src='https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/images/project-share-images/7_view_project.png' width=800>

6. Click on the **Submit Project Here** button.

   <img src='https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/images/project-share-images/8_submit_project.png' width=800>

7. Paste the link to the project file named as **YYYY-MM-DD_StudentName_Project96** in the URL box and then click on the **Submit** button.

   <img src='https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/images/project-share-images/9_enter_project_url.png' width=800> 

---