- Add logging in every step.
- Solve the `drop_duplicates=True` issue.
- Understand why `drop_duplicates=True` does not remove the newer duplicate rows from the DataFrames. For reference, see the logs.
- If `expand=False`, check that the CSV file does not exist; if it does, raise `FileExistsError`.
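A likely explanation for the duplicates surviving: pandas' `drop_duplicates` compares all columns by default, so rows that differ in even one column (e.g. a scrape timestamp) are not treated as duplicates. A minimal sketch, with illustrative column names:

```python
import pandas as pd

# Hypothetical listing data: the same property scraped twice, with a
# differing "scraped_at" value. Column names here are illustrative.
df = pd.DataFrame({
    "listing_id": [1, 1, 2],
    "price": [100, 100, 200],
    "scraped_at": ["2024-01-01", "2024-01-02", "2024-01-01"],
})

# drop_duplicates() compares ALL columns by default, so rows that differ
# only in "scraped_at" are NOT considered duplicates -- a common reason
# "newer" rows survive deduplication.
print(len(df.drop_duplicates()))  # 3 rows remain

# Restricting the comparison to the identifying columns fixes this;
# keep="first" (the default) keeps the older row, keep="last" the newer.
deduped = df.drop_duplicates(subset=["listing_id"], keep="first")
print(len(deduped))  # 2 rows remain
```

Comparing the logged row counts before and after deduplication against `subset=` variants should confirm whether this is the cause.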
- Also, add a parameter to the `export_dfs` method which rewrites the facets DataFrames.
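The file-exists check and the rewrite parameter could be combined into one export guard. A minimal sketch; the function name and signature are assumptions, not the project's real `export_dfs` API:

```python
from pathlib import Path

import pandas as pd


def export_df(df: pd.DataFrame, path: str, overwrite: bool = False) -> None:
    """Write a DataFrame to CSV, refusing to clobber an existing file
    unless overwrite=True. Sketch only -- the real export_dfs method
    may take multiple DataFrames and differ in shape."""
    target = Path(path)
    if target.exists() and not overwrite:
        raise FileExistsError(
            f"{target} already exists; pass overwrite=True to replace it"
        )
    df.to_csv(target, index=False)
```

With `overwrite=False` as the default, accidental clobbering fails loudly, and the rewrite behaviour is opt-in per call.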
- Create a function which fetches the data from the website and stores it in CSV format periodically.
- Handle Pydantic validation exceptions.
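For the Pydantic item, validating row by row and collecting failures keeps one bad record from aborting the whole batch. A sketch with an illustrative schema (the project's real model will differ):

```python
from pydantic import BaseModel, ValidationError


class Listing(BaseModel):
    # Illustrative fields only -- substitute the project's actual schema.
    listing_id: int
    price: float


def parse_listings(raw_rows: list[dict]) -> tuple[list[Listing], list[dict]]:
    """Validate rows one by one, collecting failures instead of
    raising on the first bad record."""
    good, bad = [], []
    for row in raw_rows:
        try:
            good.append(Listing(**row))
        except ValidationError as exc:
            # Keep the raw row plus the error details so the data can
            # be downloaded and re-cleaned later.
            bad.append({"row": row, "errors": exc.errors()})
    return good, bad
```

The `bad` list is exactly the "un-cleaned data" the download/upload flow below would operate on.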
- Make the code more comprehensive and modular. Right now it is tightly coupled and hard to use or understand in isolation.
- If any error occurs after fetching the data, show a button to download the un-cleaned data, and create a page to upload that data for cleaning.
- Add a button to download the raw response (without validating through Pydantic).
- Add new features to the `SRP` class to get better data for model building and data analysis.
- Check how to identify rental apartments/properties.
- Show the logging inside the app.
Now I want to convert this project into a Streamlit app.
- Find all the city IDs to create URLs.
- Create a separate module which handles DataFrame operations.
- Create a class to deal with API requests.
- Add a parameter to pass the city ID.
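The request-handling class and the city-ID parameter fit naturally together. A sketch; the base URL and query-parameter names are assumptions, not the site's real API:

```python
import requests


class PropertyAPIClient:
    """Sketch of a request-handling class. The "cityId" and "page"
    parameter names are hypothetical -- adjust to the real endpoint."""

    def __init__(self, base_url: str, timeout: float = 10.0) -> None:
        self.base_url = base_url
        self.timeout = timeout
        # A shared Session reuses connections across requests.
        self.session = requests.Session()

    def fetch_listings(self, city_id: int, page: int = 1) -> dict:
        # city_id is a plain parameter, so one client instance can be
        # reused for every city.
        resp = self.session.get(
            self.base_url,
            params={"cityId": city_id, "page": page},
            timeout=self.timeout,
        )
        resp.raise_for_status()
        return resp.json()
```

Keeping all HTTP concerns (session, timeout, error handling) in one class is what decouples the fetch layer from the DataFrame module above.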
- Improve logging:
  - Keep only one logging file.
  - Write more comprehensive logging messages.
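A single shared log file can be enforced with one setup function that every module calls. A sketch; the logger and file names are illustrative:

```python
import logging


def setup_logger(name: str = "scraper", logfile: str = "app.log") -> logging.Logger:
    """Configure one shared log file for the whole project.
    Names here are illustrative, not the project's actual ones."""
    logger = logging.getLogger(name)
    if logger.handlers:
        # Already configured -- avoids duplicate handlers (and duplicate
        # log lines) when this is called from several modules.
        return logger
    logger.setLevel(logging.DEBUG)
    handler = logging.FileHandler(logfile)
    # Including module/function names makes the messages more comprehensive.
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(levelname)s %(name)s:%(funcName)s - %(message)s"
    ))
    logger.addHandler(handler)
    return logger
```

Modules then call `setup_logger()` instead of `logging.getLogger(__name__)`, so every message lands in the same file with the same format.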
- Tackle the first data-saving error ("No columns to parse from file" when reading the CSV).
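That message is pandas' `EmptyDataError`, raised when `read_csv` is pointed at a file with no columns (typically a zero-byte file left by an interrupted save). One way to tackle it, as a sketch:

```python
import pandas as pd


def read_csv_safe(path: str) -> pd.DataFrame:
    """Return an empty DataFrame when the CSV exists but has no
    columns (e.g. a zero-byte file from an interrupted save),
    instead of crashing with EmptyDataError."""
    try:
        return pd.read_csv(path)
    except pd.errors.EmptyDataError:
        return pd.DataFrame()
```

The complementary fix is on the writing side: never create the CSV until there is at least one row to write, so the empty file never exists.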