To get clear insights, New York TLC's data must be analyzed, key variables identified, and the dataset ensured it is ready for analysis.
I use Python to show data structuring and cleaning, as well as any matplotlib/seaborn visualizations plotted to help understand the data, a box plot of the ride durations, and some time series plots, like a breakdown by quarter or month.
The project is reaching its midpoint. Next, I get a specific assignment: to compute descriptive statistics and conduct a hypothesis test.
It’s time to work on predicting the taxi fare amounts. We are ready to build the regression model and update the client New York City TLC about our progress.
Our client, the New York City Taxi & Limousine Commission (New York City TLC), has requested that I build a machine learning model to predict if a customer will not leave a tip. They want to use the model in an app that will alert taxi drivers to customers who are unlikely to tip since drivers depend on tips.