Replication package for “Why is Developing Machine Learning Applications Challenging? A Study on Stack Overflow Posts”

mshangiti/esem2019


What is this?

This repository contains the replication package for the paper “Why is Developing Machine Learning Applications Challenging? A Study on Stack Overflow Posts”, published at the 2019 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM 2019).

@inproceedings{DBLP:conf/esem/AlshangitiSMLY19,
  author    = {Moayad Alshangiti and
               Hitesh Sapkota and
               Pradeep K. Murukannaiah and
               Xumin Liu and
               Qi Yu},
  title     = {Why is Developing Machine Learning Applications Challenging? {A} Study
               on Stack Overflow Posts},
  booktitle = {2019 {ACM/IEEE} International Symposium on Empirical Software Engineering
               and Measurement, {ESEM} 2019, Porto de Galinhas, Recife, Brazil, September
               19-20, 2019},
  pages     = {1--11},
  publisher = {{IEEE}},
  year      = {2019},
  url       = {https://doi.org/10.1109/ESEM.2019.8870187},
  doi       = {10.1109/ESEM.2019.8870187},
  timestamp = {Wed, 23 Oct 2019 17:15:06 +0200},
  biburl    = {https://dblp.org/rec/bib/conf/esem/AlshangitiSMLY19},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

What is included?

The package consists of the following:

  1. code.R: This file contains the code needed to replicate all the figures in the paper. Detailed comments accompany the code to explain what each part does.

  2. quantitative_sample: This folder contains the Stack Overflow quantitative study sample discussed in the paper, consisting of 86,983 ML-related question posts. The answers are also provided for the sample where an accepted answer is available. Finally, the folder includes the web development sample used in RQ1 to compare response times between web development questions and machine learning questions.

  3. qualitative_sample: This folder contains the Stack Overflow qualitative study sample discussed in the paper, consisting of 684 ML-related questions posted by 50 unique users, along with their labels. The user expertise labels are also provided.

  4. custom: This folder contains all other data used in the paper: the LDA and topic-term matrices for the 30 discovered topics, the tags and their statistics, and the ExpertiseRank score generated to compare the number of experts in machine learning against web development.
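
Before running code.R, it may help to confirm that a local clone has the four entries listed above. The following is a minimal sketch in Python, not part of the package itself: the helper name `missing_entries` is hypothetical, and it only assumes that code.R is a file and the other three entries are folders, as described in the list.

```python
from pathlib import Path

# Entry names are taken from the list above; nothing is assumed about
# the files stored inside each folder.
EXPECTED_FILES = ["code.R"]
EXPECTED_DIRS = ["quantitative_sample", "qualitative_sample", "custom"]


def missing_entries(repo_root):
    """Return the expected package entries that are absent under repo_root.

    Hypothetical helper for illustration; it checks only for existence.
    """
    root = Path(repo_root)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing


if __name__ == "__main__":
    # "." assumes the script is run from the root of a local clone.
    absent = missing_entries(".")
    if absent:
        print(f"Missing entries: {absent}")
    else:
        print("All expected entries found.")
```

An empty result means all four entries are present; it says nothing about whether the data files inside the folders are complete.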
