HBernigau/StackOverflowAnalysis

StackOverflowAnalysis

This application provides an analysis framework for Stack Overflow posts.

The following features are supported:

  • automatic loading of Stack Overflow posts with a specific tag (for example "microservices")
  • automatic parsing of Stack Overflow posts and extraction of content and various metadata fields
  • automatic persistence of content and metadata in appropriate back-end databases for further analysis
  • automatic execution of topic learning via latent Dirichlet allocation (LDA)
  • automatic generation of analysis artefacts for topics (word cloud and LDAvis)
  • Jupyter notebook for presentation of several descriptive statistics of the corpus and LDA results
  • Jupyter notebook for analysis of the contributor networks for these posts
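
As an illustration of the parsing and extraction step listed above, here is a minimal sketch using only the Python standard library. It is not the project's actual parsing module: the HTML snippet, its class names and `data-*` attributes, and the names `SAMPLE_POST`, `PostParser` and `parse_post` are all assumptions made up for this example, and the markup served by Stack Overflow looks different.

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration only; real Stack Overflow pages differ.
SAMPLE_POST = """
<div class="question" data-post-id="101" data-score="7">
  <h1 class="title">How do microservices share data?</h1>
  <div class="body"><p>Each service has its <b>own</b> database.</p></div>
  <a class="tag">microservices</a>
  <a class="tag">architecture</a>
</div>
"""


class PostParser(HTMLParser):
    """Collect title, body text, tags and two metadata fields from one post."""

    def __init__(self):
        super().__init__()
        self.post = {"id": None, "score": None, "title": "", "body": "", "tags": []}
        # One entry per open element: the field its text belongs to (or None).
        # Limitation: void elements such as <br> without "/" would unbalance it.
        self._stack = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        css = attrs.get("class", "")
        if css == "question":
            # Metadata is read from the (assumed) data attributes.
            self.post["id"] = int(attrs["data-post-id"])
            self.post["score"] = int(attrs["data-score"])
        if css in ("title", "body", "tag"):
            field = css
        else:
            # Nested elements (e.g. <p>, <b>) inherit the enclosing field.
            field = self._stack[-1] if self._stack else None
        self._stack.append(field)

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        field = self._stack[-1] if self._stack else None
        if field == "tag":
            self.post["tags"].append(data.strip())
        elif field in ("title", "body"):
            self.post[field] += data


def parse_post(html):
    """Return a dict with the content and metadata of a single post."""
    parser = PostParser()
    parser.feed(html)
    parser.close()
    return parser.post


if __name__ == "__main__":
    print(parse_post(SAMPLE_POST))
```

A dictionary like the one returned by `parse_post` is the kind of record that could then be persisted in a database and passed on to topic learning. Because the extraction depends on class names and attributes, any change in the served HTML requires the parser to be adapted.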

On the technical side the application uses:

  • Docker
  • PostgreSQL
  • Elasticsearch
  • Prefect
  • Dask
  • and: lots of Python

Documentation

⚠️ The documentation is a work in progress.

For the project documentation see Stackoverflow analysis on readthedocs.

How to proceed

⚠️ Currently the code cannot be executed.
The reason is a structural change in the HTML code served by the Stack Overflow servers in February 2022, which requires adaptations to the parsing module. See issue 1 for details and be a little patient for the second release... 😊

Read the documentation, Stackoverflow analysis on readthedocs, especially the section on installation.

Background

This software project was part of my master's thesis for my Executive MBA in Business and IT, a joint
program of the Technical University of Munich (Germany) and the University of St. Gallen (Switzerland) with an exchange module at Tsinghua University, Beijing.

Many thanks at this point to my supervisor, Prof. Barbara Weber, Chair of Software Systems Programming and Development at the University of St. Gallen!

My participation in that program was financed by my employer, d-fine GmbH, for which I am also very thankful.

I am also very thankful to my wife and to my nearly two-year-old son for being very supportive during that rather stressful period, which consisted basically of working, coding, fighting with the underlying technologies, reading articles and writing my master's thesis...
