I am trying to run a simple decision tree regression on the movie collection data to find out how each independent variable affects the budget. Using Python, we can fit the tree and visualize the result. The steps are:
- Import the required libraries
- Load the data from Movie_regression.csv into a pandas DataFrame
- Treat missing values by replacing them with the column mean
- Create dummy variables for the columns that hold string values
- Define the X variables (features) and the y variable (target)
- Split the data into train and test sets with a test size of 0.2
- Import the decision tree regressor, fit it on the training set, and predict y for the test set
- Compute the mean squared error and the R2 score, then define the dot data for the tree
- Display the regression tree graph
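
The steps above could be sketched roughly as follows. This is a minimal sketch, not the exact code for this dataset: the column names (`Budget`, `Marketing_expense`, `Time_taken`, `Genre`), the synthetic stand-in data, `max_depth=3`, and `random_state=0` are all assumptions made so the example runs without the CSV file. With the real file, the stand-in DataFrame would be replaced by `pd.read_csv("Movie_regression.csv")` and the column names adjusted to match.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor, export_graphviz
from sklearn.metrics import mean_squared_error, r2_score

# Stand-in data so the sketch runs anywhere (all column names are hypothetical).
# With the real file, use instead: df = pd.read_csv("Movie_regression.csv")
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Budget": rng.uniform(20_000, 50_000, n),               # assumed target column
    "Marketing_expense": rng.uniform(20, 200, n),           # assumed numeric feature
    "Time_taken": rng.uniform(100, 200, n),                 # assumed feature with gaps
    "Genre": rng.choice(["Action", "Comedy", "Drama"], n),  # assumed string feature
})
df.loc[df.sample(10, random_state=0).index, "Time_taken"] = np.nan

# Replace missing numeric values with the mean of their column.
df = df.fillna(df.mean(numeric_only=True))

# Turn the string column into dummy (0/1) columns.
df = pd.get_dummies(df, columns=["Genre"], drop_first=True, dtype=float)

# Features and target.
X = df.drop(columns="Budget")
y = df["Budget"]

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit the regression tree and predict on the test set.
# max_depth=3 is an arbitrary choice that keeps the plotted tree readable.
regtree = DecisionTreeRegressor(max_depth=3, random_state=0)
regtree.fit(X_train, y_train)
y_pred = regtree.predict(X_test)

# Evaluate the predictions.
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("MSE:", mse)
print("R2:", r2)

# Export the fitted tree as Graphviz dot source (a plain string).
dot_data = export_graphviz(
    regtree, out_file=None, feature_names=list(X.columns), filled=True
)

# To display the graph, the optional `graphviz` package can render the string,
# for example in a notebook:
#   import graphviz
#   graphviz.Source(dot_data)
```

Because the stand-in data is random, the scores printed here say nothing about the real movie data; only the sequence of steps carries over.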










