@LxMLS @Helsinki-NLP

  • ☁️ I am the Global AI/ML Practice Lead at Nordcloud, an IBM Company. Nordcloud is a leading European provider of managed and professional services for the public cloud.

  • 🔬 I'm also a researcher and a PhD student in Language Technology at the University of Helsinki. My research focuses on natural language processing, natural language understanding, machine translation and machine learning. My PhD work is on Natural Language Inference.

Visit my website for more details.

Some of my academic work

  • NLI Data Sanity Check: This repository contains data and a sample script for the paper: Aarne Talman, Marianna Apidianaki, Stergios Chatzikyriakidis, Jörg Tiedemann. 2021 (forthcoming). NLI Data Sanity Check: Assessing the Effect of Data Corruption on Model Performance. Proceedings of NoDaLiDa.

  • Prosody: Contains the largest annotated English-language dataset with labels for prosodic prominence, along with code for predicting prosodic prominence from written text using models such as BERT and BiLSTM. The prosody corpus contains automatically generated, high-quality prosodic annotations for the LibriTTS corpus (Zen et al. 2019), produced with the Continuous Wavelet Transform Annotation method (Suni et al. 2017). [paper]

  • Natural Language Inference system (HBMP): Natural language inference system written in Python and PyTorch implementing the HBMP sentence encoder. [paper]

  • NLP Notebooks: Jupyter notebooks exploring different NLP/ML use cases and tasks. Some of the notebooks have been published as blog posts on my website.

  • NLI with Transformers: Code for fine-tuning different transformer models with NLI data.


  1. Helsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text

    Python 146 22

  2. Sentence Embeddings in NLI with Iterative Refinement Encoders

    Python 74 15

  3. Data and scripts for a diagnostic test suite for assessing whether an NLU dataset is a good testbed for evaluating models' meaning-understanding capabilities.

    Jupyter Notebook 3

  4. Jupyter notebooks exploring different NLP/ML use cases and tasks

    Jupyter Notebook

  5. Fine-tune transformers with NLI data


224 contributions in the last year

Activity overview
Contributed to aarnetalman/blog, aarnetalman/aarnetalman, aarnetalman/runlogger and 5 other repositories
