Ph.D. Candidate, Department of Artificial Intelligence, Korea University
Research in Affective Computing:
Emotional Speech Synthesis • Conversational Speech Synthesis • Full-Duplex Speech-to-Speech • Emotion Recognition • Emotional Voice Conversion • Multimodal Affective Modeling
Education

- [Sep. 2022 - Present] Korea University, Seoul, South Korea.
- Integrated M.S. & Ph.D. in Artificial Intelligence, Pattern Recognition & Machine Learning Lab, under the supervision of Seong-Whan Lee.
- GPA: 4.19 / 4.50
- [Mar. 2016 - Jul. 2022] Hanyang University, Ansan, South Korea.
- B.S. in Applied Mathematics.
- GPA: 4.40 / 4.50
Work Experience

- [May 2024 - Dec. 2024] Samsung Research, Korea.
- [Sep. 2025 - Oct. 2025] Thomas Crown, USA.
- [Oct. 2025 - Oct. 2026] Murf AI, USA.
- [Jan. 2022 - Feb. 2022] RexSoft (Intern) | Development Department
- Debugged and enhanced the REX program
- [Jun. 2021 - Jul. 2021] Visang Education (Intern) | E-Learning Content Planning Department
- Managed data entry, validation, analysis, and report creation
Publications

[2026]
- D.-H. Cho, H.-S. Oh, S.-B. Kim, and S.-W. Lee, “Affectron: Emotional Speech Synthesis with Affective and Contextually Aligned Nonverbal Vocalizations,” (Under Review), 2026.
- H.-S. Oh, D.-H. Cho, S.-B. Kim, and S.-W. Lee, “Toward Complex-Valued Neural Networks for Waveform Generation,” in Proc. International Conference on Learning Representations (ICLR), 2026.
[2025]
- D.-H. Cho, H.-S. Oh, S.-B. Kim, and S.-W. Lee, “DiEmo-TTS: Disentangled Emotion Representations via Self-Supervised Distillation for Cross-Speaker Emotion Transfer in Text-to-Speech,” in Proc. Annual Conference of the International Speech Communication Association (INTERSPEECH), 2025.
- N.-G. Kim, D.-H. Cho, S.-B. Kim, and S.-W. Lee, “Spotlight-TTS: Spotlighting the Style via Voiced-Aware Style Extraction and Style Direction Adjustment for Expressive Text-to-Speech,” in Proc. Annual Conference of the International Speech Communication Association (INTERSPEECH), 2025.
- D.-H. Cho*, H.-S. Oh*, S.-B. Kim*, and S.-W. Lee, “EmoSphere-SER: Enhancing Speech Emotion Recognition through Spherical Representation with Auxiliary Classification,” in Proc. Annual Conference of the International Speech Communication Association (INTERSPEECH), 2025.
- D.-H. Cho, H.-S. Oh, S.-B. Kim, and S.-W. Lee, “EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector,” in IEEE Transactions on Affective Computing (TAFFC), 2025.
- H.-S. Oh, S.-H. Lee, D.-H. Cho, and S.-W. Lee, “DurFlex-EVC: Duration-Flexible Emotional Voice Conversion Leveraging Discrete Representations Without Text Alignment,” in IEEE Transactions on Affective Computing (TAFFC), 2025.
[2024]
- J.-E. Lee, S.-B. Kim, D.-H. Cho, and S.-W. Lee, “PromotiCon: Prompt-Based Emotion Controllable Text-to-Speech via Prompt Generation and Matching,” in Proc. IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2024.
- D.-H. Cho, H.-S. Oh, S.-B. Kim, S.-H. Lee, and S.-W. Lee, “EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech,” in Proc. Annual Conference of the International Speech Communication Association (INTERSPEECH), 2024.
Awards
- [2025] Excellence Award
Extreme-Noise Speech Recognition & Restoration AI Model Development Competition (AI Frontier Challenge) — Korea Artificial Intelligence Association (KAIA)
- [2021] Excellence Award
Credit Card User Delinquency Prediction AI Competition — Hanyang University & Dacon
Reviewer
- IEEE Transactions on Affective Computing (TAFFC)
- IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- IEEE Signal Processing Letters (SPL)
- IEEE International Conference on Systems, Man, and Cybernetics (SMC)
- Annual Conference of the International Speech Communication Association (INTERSPEECH)
Patents

- Method for Cross-Speaker Emotion Transfer in Text-to-Speech Using Disentangled Emotion Representations via Self-Supervised Distillation, 10-2025-0130127.
- Method and System for Expressive Text-to-Speech via Voiced-Aware Style Extraction and Style Direction Adjustment, 10-2025-0116457.
- Apparatus and Method for Speech Synthesis, 10-2025-0088028.
- Method, Device, and Program for Synthesizing Voices Expressing Emotions Based on Prompts, 10-2024-0099370.
- Emotional Expression Voice Generation Apparatus and Method Capable of Controlling Emotional Style and Intensity Using Continuous Emotional Dimensions, 10-2024-0029066.
