new papers and removed a gem for compilation on my home computer
macarbonneau committed Oct 31, 2023
1 parent 60557fc commit a1ec432
Showing 10 changed files with 52 additions and 28 deletions.
1 change: 0 additions & 1 deletion Gemfile
Original file line number Diff line number Diff line change
@@ -15,7 +15,6 @@ group :jekyll_plugins do
gem 'jekyll-toc'
gem 'jekyll-twitter-plugin'
gem 'jemoji'
gem 'mini_racer'
gem 'unicode_utils'
gem 'webrick'
end
31 changes: 19 additions & 12 deletions _bibliography/papers.bib
@@ -15,16 +15,22 @@ @ARTICLE{Carbonneau2022
selected={true}
}

@misc{vanniekerk2023rhythm,
title={Rhythm Modeling for Voice Conversion},
author={Benjamin {van Niekerk} and Marc-André Carbonneau and Herman Kamper},
year={2023},
eprint={2307.06040},
journal ={arXiv},
preview={Urhythmic.png},
bibtex_show={true},
selected={true},
}
@ARTICLE{vanniekerk2023rhythm,
author={{van Niekerk}, Benjamin and Carbonneau, Marc-André and Kamper, Herman},
journal={IEEE Signal Processing Letters},
title={Rhythm Modeling for Voice Conversion},
year={2023},
volume={30},
number={},
pages={1297-1301},
preview={Urhythmic.png},
bibtex_show={true},
selected={true},
abstract={Voice conversion aims to transform source speech into a different target voice. However, typical voice conversion systems do not account for rhythm, which is an important factor in the perception of speaker identity. To bridge this gap, we introduce Urhythmic—an unsupervised method for rhythm conversion that does not require parallel data or text transcriptions. Using self-supervised representations, we first divide source audio into segments approximating sonorants, obstruents, and silences. Then we model rhythm by estimating speaking rate or the duration distribution of each segment type. Finally, we match the target speaking rate or rhythm by time-stretching the speech segments. Experiments show that Urhythmic outperforms existing unsupervised methods in terms of quality and prosody.},
doi={10.1109/LSP.2023.3313515}
}




@article{langevin_energy_2021,
@@ -232,9 +238,10 @@ @inproceedings{10.1145/3424636.3426898
booktitle = {Proceedings of the 13th ACM SIGGRAPH Conference on Motion, Interaction and Games},
articleno = {9},
numpages = {11},
keywords = {artist controlled character creation, fine facial features, face texture generation, image-to-image translation},
keywords = {artist controlled character creation, image-to-image translation, face texture generation, fine facial features},
location = {Virtual Event, SC, USA},
series = {MIG '20},
preview={facemig.png},
bibtex_show={true},
selected={false},
}

18 changes: 9 additions & 9 deletions _config.yml
@@ -119,18 +119,18 @@ bing_site_verification: # out your bing-site-verification ID (Bing Webmaster)
# Blog
# -----------------------------------------------------------------------------

blog_name: News # blog_name will be displayed in your blog page
blog_nav_title: news # your blog must have a title for it to be displayed in the nav bar
blog_description: This is unlikely to be up-to-date :)
permalink: /blog/:year/:title/
#blog_name: News # blog_name will be displayed in your blog page
#blog_nav_title: news # your blog must have a title for it to be displayed in the nav bar
#blog_description: This is unlikely to be up-to-date :)
#permalink: /blog/:year/:title/

# Pagination
pagination:
enabled: true
# pagination:
# enabled: true

related_blog_posts:
enabled: true
max_related: 5
# related_blog_posts:
# enabled: true
# max_related: 5

# -----------------------------------------------------------------------------
# Collections
6 changes: 4 additions & 2 deletions _news/announcement_1.md
@@ -1,8 +1,10 @@
---
layout: post
date: 2023-06-03 15:59:00-0400
date: 2023-07-15 15:59:00-0400
inline: true
related_posts: false
---

I am moving my old [personal page](https://sites.google.com/site/marcandrecarbonneau/publications) to github.
Ubisoft has published a [blog post](https://www.ubisoft.com/en-us/studio/laforge/news/5ADkkY0BMG9vNSDuUMtkeg/zeroeggs-zeroshot-examplebased-gesture-generation-from-speech) describing our system for gesture generation conditioned on speech.
\
This system was presented in ["ZeroEGGS: Zero-shot Example-based Gesture Generation from Speech"](https://onlinelibrary.wiley.com/doi/pdf/10.1111/cgf.14734) and showcased on [2 minute papers](https://www.youtube.com/watch?v=Dt0cA2phKfU&ab_channel=TwoMinutePapers).
14 changes: 14 additions & 0 deletions _news/announcement_EDM_sound.md
@@ -0,0 +1,14 @@
---
layout: post
date: 2023-10-27 15:59:00-0400
inline: true
related_posts: false
---

Our paper "EDMSound: Spectrogram Based Diffusion Models for Efficient and High-Quality Audio Synthesis" has been accepted for presentation at the NeurIPS Workshop on ML for Audio. This work was done in collaboration with colleagues from the University of Rochester.
\
\
In this paper, we propose a diffusion-based generative model in the spectrogram domain under the framework of elucidated diffusion models (EDM). We also reveal a potential concern with diffusion-based audio generation models: they tend to duplicate their training data.
\
\
Check out the [project page](https://agentcooper2002.github.io/EDMSound/)!
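
The announcement above mentions the elucidated diffusion model (EDM) framework, whose sampler integrates a probability-flow ODE over noise levels. As a rough illustration only (a hedged sketch, not the paper's released code; the `denoise` callable stands in for a trained network), a single Euler sampling step looks like:

```python
import numpy as np

def edm_euler_step(x, sigma, sigma_next, denoise):
    # EDM probability-flow ODE: dx/dsigma = (x - D(x; sigma)) / sigma,
    # where D is the learned denoiser. One explicit Euler step
    # moves the sample from noise level sigma to sigma_next.
    d = (x - denoise(x, sigma)) / sigma
    return x + d * (sigma_next - sigma)

# Toy check with an idealized denoiser that always predicts silence (zeros):
x = np.ones(4)
x_next = edm_euler_step(x, sigma=2.0, sigma_next=1.0,
                        denoise=lambda x, s: np.zeros_like(x))
```

A full sampler would repeat this step along a decreasing noise schedule; here the single step simply contracts the sample toward the denoiser's prediction.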
8 changes: 5 additions & 3 deletions _news/announcement_Urhythmic.md
@@ -1,11 +1,13 @@
---
layout: post
date: 2023-07-31 15:59:00-0400
date: 2023-09-21 15:59:00-0400
inline: true
related_posts: false
---

We released on Arxiv our latest research effort on voice conversion. In this paper we model the natural rhythm of speakers to perform conversion while respecting the target speaker's natural rhythm. We do more than approximating the global speech rate, we model duration for sonorants, obstruents, and silences.

Our paper ["Rhythm Modeling for Voice Conversion"](https://ieeexplore.ieee.org/document/10246359) has been published in IEEE Signal Processing Letters. We also released it on [Arxiv](https://arxiv.org/abs/2307.06040).
\
In this paper we model speaker rhythm to perform conversion that respects the target speaker's natural rhythm. We do more than approximate the global speech rate: we model durations for sonorants, obstruents, and silences.
\
Check out the [demo page](https://ubisoft-laforge.github.io/speech/urhythmic/)!
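
The final step the paper describes, time-stretching segments so they match the target speaker's rhythm, can be illustrated with a minimal numpy sketch (an assumption-laden simplification using linear interpolation and a hypothetical `match_speaking_rate` helper, not the released implementation):

```python
import numpy as np

def time_stretch(segment: np.ndarray, factor: float) -> np.ndarray:
    # Stretch a 1-D waveform segment by `factor` via linear interpolation
    # (factor > 1 lengthens the segment, i.e. slows it down).
    n_out = max(1, round(len(segment) * factor))
    src_positions = np.linspace(0, len(segment) - 1, num=n_out)
    return np.interp(src_positions, np.arange(len(segment)), segment)

def match_speaking_rate(segments, source_rate, target_rate):
    # Global variant: scale every segment by source_rate / target_rate so the
    # converted utterance approximates the target's syllables per second.
    factor = source_rate / target_rate
    return [time_stretch(s, factor) for s in segments]

segments = [np.random.randn(800), np.random.randn(1200)]
stretched = match_speaking_rate(segments, source_rate=5.0, target_rate=2.5)
```

The fine-grained variant in the paper instead fits a duration distribution per segment type (sonorant, obstruent, silence) and stretches each class separately; the sketch above covers only the global speaking-rate case.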

2 changes: 1 addition & 1 deletion _pages/about.md
@@ -28,7 +28,7 @@ Here's a subset of my research interests:


Since 2017, I work as a research scientist at Ubisoft in the [La Forge lab](https://www.ubisoft.com/en-us/studio/laforge).
I lead a group of resarchers applying the latest techniques in machine learning, speech, signal processingm, computer vision & graphics, animation to video games.
I lead a group of researchers applying the latest techniques in machine learning, speech, signal processing, computer vision & graphics, and animation to video games.

Before that, as a PhD student, I was affiliated with two labs:
- LIVIA - [Laboratory for Imagery, Vision and Artificial Intelligence](https://liviamtl.ca/)
Binary file removed assets/img/BrainStorm_1.jpg.crdownload
Binary file not shown.
Binary file removed assets/img/BrainStorm_2.jpg.crdownload
Binary file not shown.
Binary file added assets/img/facemig.png
Binary file not shown.
