16 medal-winning analytical reports written in Python on Kaggle's 16GB Kernel.

About the Author

Score Table

| Departments | Top Levels | Highest Rank | Awarded Medals |
| --- | --- | --- | --- |
| Code | 0.2% | 317/161,898 | 3 Silver, 13 Bronze |
| Discussion | 0.3% | 588/188,433 | 100 Bronze |
| Datasets | 1% | 354/34,643 | 3 Bronze |
| Competitions | 20-30% | | |

Abstracts

3 Silver Medals

  1. Optimized LightGBM with Optuna adding SAKT Model
    • Lead sentences
      • Submitted code using two Ensemble Learning Methods.
      • Used 100 million rows of training data for prediction on a 16GB Kernel by removing unnecessary objects.
    • Issue
      • Algorithms for TOEIC Learning Applications
    • Significance
      • Predict percentage of correct answers based on user's behavioral history.
      • User's percentage of correct answers will increase with the number of problems solved.
    • Purpose
      • Optimize Binary Classification for AUC.
    • Methodology
      • Ensemble Learning of LightGBM and SAKT.
      • Hyperparameter Optimization with Optuna.
    • Results
      • Score: AUC = 0.781.
      • Code: 31 Points.
    • Considerations
      • I concentrated on Models, so Feature Engineering remains a challenge.
      • Systematizing Multiple Models will also be a challenge in the future.
    • Conclusion
      • Code Silver Medal
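The blending step and the AUC target described above can be illustrated with a minimal sketch. It assumes a simple weighted average of the two models' probabilities, which may differ from the ensemble that was actually submitted. The function names `auc` and `blend` and all toy values are hypothetical, and no LightGBM or SAKT model is trained here.

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) formula."""
    # Sort indices by score and give tied scores their average rank.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def blend(p_a, p_b, weight_a=0.5):
    """Weighted average of two models' predicted probabilities."""
    return [weight_a * a + (1 - weight_a) * b for a, b in zip(p_a, p_b)]


# Toy data: hypothetical predictions standing in for LightGBM and SAKT outputs.
y_true = [1, 0, 1, 1, 0, 0]
p_lgbm = [0.9, 0.4, 0.35, 0.8, 0.2, 0.5]
p_sakt = [0.7, 0.3, 0.6, 0.6, 0.4, 0.2]
p_blend = blend(p_lgbm, p_sakt, weight_a=0.5)

for name, preds in [("lgbm", p_lgbm), ("sakt", p_sakt), ("blend", p_blend)]:
    print(name, round(auc(y_true, preds), 3))
```

In practice `weight_a` would be chosen on a validation split, not fixed at 0.5.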
  2. LightGBM Classifier and Logistic Regression Report
    • Lead sentences
      • Optimized Classification of Anonymized Raw Data from Stock Market on 16GB Kernel.
      • Contributed code that systematizes Ensemble Learning and Logistic Regression.
    • Issue
      • Utility Function Optimization of Supply and Demand Forecasting in Securities Markets.
    • Significance
      • Calculate based on indicators of the presence and degree of stock returns.
      • Optimize the behavior of whether to trade or not.
    • Purpose
      • AI Dev for Profit Maximization.
    • Methodology
      • Optimal classification of LightGBM.
      • Logit Transformation of Target Variables Based on Probability Distributions.
    • Results
      • Score: 3741.118 (Outside of Medal Zone).
      • Code: 33 Points.
    • Considerations
      • The utility function was not fully deciphered, which left some issues for paper surveys.
      • The report was appreciated by other Kagglers.
    • Conclusion
      • Code Silver Medal
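As a rough illustration of the logit transformation and the trade-or-not decision mentioned above, the sketch below maps a probability to log-odds and back. The clipping constant `eps`, the helper `decide_trade`, and the 0.5 threshold are assumptions made for this example, not details taken from the report.

```python
import math


def logit(p, eps=1e-6):
    """Map a probability to log-odds, clipping to avoid log(0)."""
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))


def sigmoid(z):
    """Inverse of logit: map log-odds back to a probability."""
    return 1 / (1 + math.exp(-z))


def decide_trade(probability, threshold=0.5):
    """Hypothetical decision rule: trade (1) only above the threshold."""
    return 1 if probability > threshold else 0


log_odds = logit(0.8)
recovered = sigmoid(log_odds)
action = decide_trade(recovered)
```

Working in log-odds keeps the transformed target unbounded, which is one common reason to apply this transform before fitting a linear model such as Logistic Regression.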
  3. Optimize LightGBM HyperParameter with Optuna and GPU
    • Lead sentences
      • Unprecedented LightGBM Hyperparameter Optimization on GPU.
      • The procedure was attached and well received.
    • Issues
      • Preliminaries of “LightGBM Classifier and Logistic Regression Report”.
    • Significance
      • Hyperparameter Optimization.
      • There were few precedents for LightGBM.
    • Purpose
      • Code submission for optimizing LightGBM Hyperparameter on GPU.
    • Methodology
      • A survey of prior case studies using Optuna for LightGBM.
      • Procedure of submissions.
    • Results
      • Run: 953.9s on GPU
      • Code: 31 Points.
    • Consideration
      • The procedure is available for hyperparameter optimization in future work.
    • Conclusion
      • Code Silver Medal.
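The shape of a hyperparameter search like the one described above can be sketched without any GPU or external library. This example deliberately swaps in plain random search for Optuna's samplers, and a toy loss function for real LightGBM training; `toy_validation_loss`, `sample_params`, `random_search`, and the search ranges are all hypothetical. Only the parameter names `learning_rate` and `num_leaves` correspond to real LightGBM parameters.

```python
import math
import random


def toy_validation_loss(params):
    """Hypothetical stand-in for training LightGBM and scoring a validation set."""
    lr_term = (math.log10(params["learning_rate"]) + 1.5) ** 2
    leaves_term = ((params["num_leaves"] - 64) / 64) ** 2
    return lr_term + leaves_term


def sample_params(rng):
    """Draw one candidate: log-uniform learning rate, integer number of leaves."""
    return {
        "learning_rate": 10 ** rng.uniform(-3, 0),
        "num_leaves": rng.randint(8, 256),
    }


def random_search(n_trials, seed=0):
    """Keep the candidate with the lowest loss over n_trials random draws."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = sample_params(rng)
        loss = toy_validation_loss(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


best_params, best_loss = random_search(n_trials=200)
print(best_params, best_loss)
```

A dedicated optimizer would replace `sample_params` with a sampler that learns from earlier trials, which usually needs fewer trials than random draws to reach a comparable loss.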

13 Bronze Medals

  1. Optimized Logit LightGBM Classifier and CNN Models
    • Lead sentences
      • Submitted a simulation of Multiple Model Systematization.
      • Based on this failure, I was able to concentrate on LightGBM Optimization and Inference.
    • Issue
      • Exploring Optimization Models
    • Significance
      • Simulation iterations of Optimization Model.
    • Purpose
      • Optimize Utility Function by systematizing Multiple Models.
    • Methodology
      • Applying the Logit Transform to LightGBM.
      • Explore combining with CNN.
    • Results
      • Score: 3344.738 (Outside of Medal Zone).
      • Code: 15 Points.
    • Considerations
      • This code runs LightGBM and CNN at the same time, which made it prone to overflow.
      • From now on, I will focus on optimizing one Model at a time.
    • Conclusion
      • Code Bronze Medal
  2. Optimized LightGBM with Optuna
    • Lead sentences
      • Developed a Baseline Model for a Code Competition to process 100 million rows of training data.
      • The minimum memory requirement was estimated at 16GB.
    • Issue
      • 100 million rows of training data must be predicted on a 16GB Kernel.
    • Significance
      • This is the cornerstone of the final submission model.
      • Preprocess and Feature Engineering were adjusted for further optimization.
    • Purpose
      • Baseline Model Dev
    • Methodology
      • Binary Classification by LightGBM Optimization.
    • Results
      • Score: AUC = 0.774.
      • Code: 12 Points.
    • Considerations
      • The policy was to build additional development on top of the Baseline Model.
      • The improvement of AUC by the additional development was only 0.007 (from 0.774 to 0.781), which left some issues.
    • Conclusion
      • Code Bronze Medal
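The 16GB constraint above comes down to simple arithmetic on rows, columns, and bytes per value. The sketch below estimates the in-memory size of a 100-million-row table under two schemas. The column names and dtype choices are hypothetical and are not taken from the competition data; the byte sizes are the standard widths of those numeric types.

```python
# Standard storage width, in bytes, of common fixed-size numeric types.
BYTES_PER_VALUE = {
    "int8": 1, "int16": 2, "int32": 4, "int64": 8,
    "float32": 4, "float64": 8,
}


def estimate_gib(n_rows, column_dtypes):
    """Estimate raw table size in GiB, ignoring index and object overhead."""
    bytes_per_row = sum(BYTES_PER_VALUE[d] for d in column_dtypes.values())
    return n_rows * bytes_per_row / 2**30


N_ROWS = 100_000_000

# Hypothetical schema using default 64-bit types for every column.
wide_schema = {
    "row_id": "int64", "timestamp": "int64", "user_id": "int64",
    "item_id": "int64", "label": "int64", "elapsed": "float64",
}
# Same hypothetical columns, downcast to the smallest type assumed to fit.
narrow_schema = {
    "row_id": "int32", "timestamp": "int64", "user_id": "int32",
    "item_id": "int16", "label": "int8", "elapsed": "float32",
}

wide_gib = estimate_gib(N_ROWS, wide_schema)
narrow_gib = estimate_gib(N_ROWS, narrow_schema)
print(f"wide: {wide_gib:.2f} GiB, narrow: {narrow_gib:.2f} GiB")
```

Intermediate copies made during preprocessing and training add to this figure, which is why deleting objects that are no longer needed also matters on a fixed-memory Kernel.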
  3. LightGBM on GPU with Feature Engineering, Optuna, and Visualization
    • Lead sentence
      • Code Bronze Medal for first attempt at submitting code.
    • Issue
      • This was my first real effort at Kaggle.
    • Significance
      • Visualized the data in a timely manner and studied the features.
      • Optuna was also used for the first time and applied later.
    • Purpose
      • Work on Feature Engineering.
    • Methodology
      • I read and referred to code posted by a Kaggle Grandmaster.
    • Results
      • Code: 11 Points.
    • Consideration
      • I gained experience in implementing LightGBM with Optuna on GPU.
    • Conclusion
      • Code Bronze Medal.
  4. LightGBM with the Inference and Empirical Analysis
    • Lead sentences
      • In the first scored submission code, AUC = 0.76.
      • The challenges became the cornerstone of my development experience.
    • Issue
      • Scoring by developing additions to the submitted code for my first challenge.
    • Significance
      • A single process was limited to Model Object Generation.
    • Purpose
      • To further improve the performance of Prediction Model.
    • Methodology
      • Inference was added to improve Score.
      • Empirical Analysis between raw data and predicted results.
      • Detected significant differences in Gaussian Distribution.
    • Results
      • Score: AUC = 0.76.
      • Code: 12 Points.
    • Consideration
      • My understanding of inference was still insufficient in this submitted code, which remained an issue.
    • Conclusion
      • Code Bronze Medal.
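One way to compare raw data with predicted results under a Gaussian assumption is Welch's t statistic on the two sample means. This is only a sketch of that general idea: the report does not state which test was used, and the function `welch_t` and the synthetic samples below are hypothetical.

```python
import math
import random
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t statistic for the difference between two sample means."""
    mean_a = statistics.fmean(sample_a)
    mean_b = statistics.fmean(sample_b)
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error


# Synthetic stand-ins for raw values and model predictions.
rng = random.Random(0)
raw_values = [rng.gauss(0.0, 1.0) for _ in range(500)]
predictions = [rng.gauss(0.3, 0.5) for _ in range(500)]

t_stat = welch_t(raw_values, predictions)
print(f"Welch t = {t_stat:.2f}")
```

A large absolute value of the statistic suggests the two distributions have different means; comparing the sample variances directly would show whether the predictions are also narrower than the raw data.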
  5. Submission and the Inference of LightGBM
    • Lead sentences
      • Prototype of my first scoring submission code.
      • Because there were few examples of Empirical Analysis, it won a Code Bronze Medal.
    • Issue
      • Prototype version of submission code for first scoring.
    • Significance
      • Implementing the scoring submission code.
    • Purpose
      • Gaining development experiences.
    • Methodology
      • Model objects were coded for scoring.
      • Empirical Analysis detected a significant difference in Gaussian Distribution.
    • Result
      • Code: 7 Points.
    • Considerations
      • Actual scoring submission code became a separate file.
      • This was an opportunity for me to feel the challenge of coding.
      • I focused on coding afterwards.
    • Conclusion
      • Code Bronze Medal.
  6. Market Prediction XGBoost with GPU Modified
    • Lead sentences
      • Performance comparison with LightGBM by XGBoost Optimization.
      • LightGBM takes the cake.
    • Issue
      • I have sometimes seen good results with XGBoost.
    • Significance
      • Simulate on Models other than LightGBM and search for Optimized Model.
    • Purpose
      • Score improvement by XGBoost.
    • Methodology
      • GPU Implementation into XGBoost Optimization.
    • Results
      • Score: 3308.824 (Outside of Medal Zone).
      • Code: 8 Points.
    • Considerations
      • XGBoost is easy to implement due to its many precedents.
      • LightGBM is superior in performance comparison, which led me to focus on LightGBM.
    • Conclusion
      • Code Bronze Medal.
  7. Cassava Leaf Disease Best Keras CNN Tuning
    • Lead sentences
      • I also participated in a competition on image analysis, challenging myself with raw data of various properties.
      • I was left with some issues on the theoretical side, which gave me an opportunity to work from theoretical books.
    • Issue
      • I would like to try my hand at image analysis and find out what I am good at.
    • Significance
      • I want to gain experience in Keras implementation.
      • Deepen my understanding of CNN.
    • Purpose
      • Learn to understand and implement acoustic analysis and image analysis.
    • Methodology
      • I supplemented an advanced submission code.
    • Results
      • Score: Accuracy = 0.885.
      • Code: 18 Points.
    • Considerations
      • Theoretical aspects of acoustic analysis and image analysis remained a challenge.
      • It made me aware of the need to start with a survey of theoretical papers.
    • Conclusion
      • Code Bronze Medal
  8. RFCX Residual Network with TPU Customized
    • Lead sentences
      • I also participated in a competition for acoustic analysis, and tried my hand at raw data of various properties.
      • I was left with some issues on the theoretical side, which gave me an opportunity to work from theoretical books.
    • Issue
      • I would like to try my hand at acoustic analysis and find out what I am good at.
    • Significance
      • I want to gain experience in Keras implementation.
      • Deepen my understanding of CNN.
    • Purpose
      • Learn to understand and implement acoustic analysis and image analysis.
    • Methodology
      • I supplemented an advanced submission code.
    • Results
      • Score: 0.772.
      • Code: 12 Points.
    • Considerations
      • Theoretical aspects of acoustic analysis and image analysis remained a challenge.
      • It made me aware of the need to start with a survey of theoretical papers.
    • Conclusion
      • Code Bronze Medal.
  9. Research with Customized Sharp Weighted
    • Lead sentences
      • Work on Custom Metrics Clarification and systematization of Hyperparameters Optimization in LightGBM.
      • Generating an optimization object at each milestone is still important.
    • Issue
      • Private Custom Metrics were used as an Evaluation Function.
    • Significance
      • Improve prediction accuracy by elucidating Private Custom Metrics.
      • Reproducibility will be determined based on the Evaluation Function.
    • Purpose
      • Custom Metrics Clarification.
    • Methodology
      • LightGBM Hyperparameter Optimization.
      • Systematization with Custom Metrics Decoding Examples.
    • Results
      • Generate each Parameter Optimization Object.
      • Code: 6 Points.
    • Consideration
      • Importance of each milestone optimization object generation was reaffirmed.
    • Conclusion
      • Code Bronze Medal.
  10. Optimize CatBoost HyperParameter with Optuna and GPU
    • Lead sentences
      • Performance comparison was performed on optimized Ensemble Learning.
      • LightGBM won on prediction accuracy.
    • Issue
      • I was new to CatBoost and wanted to compare performance with LightGBM.
    • Significance
      • Performance comparison of Ensemble Learning: LightGBM, XGBoost, CatBoost, etc.
    • Purpose
      • Algorithm selection for Prediction Models.
    • Methodology
      • Hyper-parameter optimization.
      • CatBoost implementation.
    • Results
      • Score: AUC = 0.500.
      • Code: 17 Points.
    • Consideration
      • At the baseline model stage, I gave the edge to LightGBM.
    • Conclusion
      • Code Bronze Medal.
  11. LightGBM on Lyft Tabular Data added Inference and Tuning
    • Lead sentences
      • Regression Prediction of LightGBM with Grid Search and Multiple Evaluation Functions.
      • A fruitful effort that uncovered all sorts of challenges.
    • Issue
      • Regression Problem for Table Data Related to Automated Driving.
    • Significance
      • I want to work on Regression Prediction with LightGBM.
      • Gain further development experiences.
      • Implement multiple evaluation functions to improve accuracy.
    • Purpose
      • Improving accuracy of Regression Prediction.
    • Methodology
      • Set evaluation functions of LightGBM in MSE and RMSE.
      • Parameter search by grid search.
    • Results
      • Score: 356.084.
      • Code: 10 Points.
    • Considerations
      • Grid search proved to be an inefficient way to optimize hyperparameters.
      • I reaffirmed the need to use feature engineering and inference.
    • Conclusion
      • Code Bronze Medal
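The grid search with MSE and RMSE described above can be sketched on a toy problem. A two-parameter linear function stands in for the LightGBM regressor here, so the model, the grid values, and the data are all hypothetical. The point is the search loop and the relationship RMSE = sqrt(MSE).

```python
import itertools
import math


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, the square root of MSE."""
    return math.sqrt(mse(y_true, y_pred))


def predict(xs, slope, intercept):
    """Toy linear model standing in for a trained regressor."""
    return [slope * x + intercept for x in xs]


# Toy data generated by y = 2x + 1.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]

grid = {"slope": [0.5, 1.0, 1.5, 2.0], "intercept": [-1.0, 0.0, 1.0]}

# Exhaustively evaluate every combination of grid values.
candidates = [
    dict(zip(grid.keys(), values))
    for values in itertools.product(*grid.values())
]
best_params = min(candidates, key=lambda p: rmse(ys, predict(xs, **p)))
best_rmse = rmse(ys, predict(xs, **best_params))
print(len(candidates), best_params, best_rmse)
```

The number of candidates is the product of the grid sizes, so each extra parameter multiplies the cost. That growth is the usual reason grid search becomes inefficient compared with sampling-based optimizers.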
  12. COVID-19 with H2OAutoML Baseline Model
    • Lead sentences
      • Experimented with AutoML performance, but found the original to be more powerful.
      • This led to the original development of the LightGBM optimization.
    • Issue
      • COVID-19 infection explosion and new global challenges.
    • Significance
      • Improvement of coding techniques for anonymized Table data.
      • Accumulate experiences using AutoML.
    • Purpose
      • Optimization Regression Prediction with AutoML.
    • Methodology
      • Set RMSLE as evaluation function for Regression Prediction with H2O.
      • Extract the optimized Regression Prediction Models: Deep Learning, XGBoost, GLM, GBM, etc.
    • Results
      • Score: RMSLE = 0.086.
      • Code: 6 Points.
    • Considerations
      • Original development was more powerful than H2OAutoML.
      • Opportunity to work on Optimized Regression Prediction with LightGBM.
    • Conclusion
      • Code Bronze Medal.
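For reference, the RMSLE evaluation function named above can be written in a few lines. This is the standard definition, computed on `log(1 + y)`; the function name `rmsle` and the sample values are only illustrative, and the sketch does not reproduce how H2O computes the metric internally.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error for non-negative targets."""
    squared_log_errors = [
        (math.log1p(p) - math.log1p(t)) ** 2
        for t, p in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Illustrative values only: cumulative counts and hypothetical forecasts.
actual = [10, 100, 1000]
forecast = [12, 90, 1100]
print(round(rmsle(actual, forecast), 4))
```

Because the error is taken on a log scale, RMSLE penalizes relative rather than absolute differences, which suits targets that span several orders of magnitude.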
  13. Optimized Predictive Model with H2OAutoML
    • Lead sentences
      • Even in Binary Classification, AutoML was found to be inferior to original development.
      • It is thought that the difference was due to Preprocess and Feature Engineering.
    • Issue
      • Regression Prediction by H2OAutoML was inferior to original development.
    • Significance
      • It was unclear whether results would be similar to Regression Prediction.
    • Purpose
      • Experiment on H2OAutoML in Binary Classification.
    • Methodology
      • Set RMSLE as the evaluation function for Binary Classification with H2O.
      • Extract the Optimized Binary Classification Models: Deep Learning, XGBoost, GLM, GBM, etc.
    • Results
      • Score: AUC = 0.850.
      • Code: 5 Points.
    • Considerations
      • The performance was higher than that of Regression Prediction Case.
      • Preprocess and Feature Engineering themselves are not automated; they have to be developed independently.
    • Conclusion
      • Code Bronze Medal.