Christian Kästner edited this page Jul 5, 2022 · 468 revisions

Reading Group

The paper reading group meets weekly during the semester to discuss papers. Participation is open to all, and guests are always welcome. If you are interested in receiving invitations, contact the organizer.

Each week we will discuss a different paper. The paper to discuss is announced about one week in advance by the organizer. All participants are expected to read the paper before the meeting. It is recommended to take notes about insights, questions, and other points potentially worth discussing.

The goals of the reading group are:

  • Critical reflection on scientific work
  • Practice of reading and argumentation strategies
  • Exposure to a broad range of research topics
  • Practice of leading group discussions

The discussion is limited to one hour and is led by a moderator, who may also set a focus for the discussion. The moderator will kick off the meeting by giving a short summary of the paper and raising a few points for discussion. The moderator should try to involve all participants in the discussion. The moderator role rotates through all participants. The moderator is encouraged to help select the paper for that week.

Time and location: Thursday 11am-noon at TCS 460 (remote participation is possible; Zoom link on request). The reading group may start 5 minutes late after a short standup meeting, which external participants are welcome to skip.

Organizer: Nadia Nahar (nadian at andrew dot cmu dot edu)

Subscribe for announcements on the feature-prg@lists.andrew.cmu.edu mailing list here: https://lists.andrew.cmu.edu/mailman/listinfo/feature-prg

Agenda

The archive of discussed papers can be found here.

June 23, 2022

TBD. Moderator: TBD

June 16, 2022

Tongshuang Wu, Marco Tulio Ribeiro, Jeffrey Heer, and Daniel Weld. 2019. Errudite: Scalable, Reproducible, and Testable Error Analysis. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 747–763, Florence, Italy. Association for Computational Linguistics. Moderator: Chenyang

June 9, 2022

Foidl, Harald, Michael Felderer and Rudolf Ramler. Data Smells: Categories, Causes and Consequences, and Detection of Suspicious Data in AI-based Systems. ArXiv abs/2203.10384 (2022): n. pag. Moderator: Christian

June 2, 2022

G. Avelino, E. Constantinou, M. T. Valente and A. Serebrenik, On the abandonment and survival of open source projects: An empirical investigation. 2019 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), 2019, pp. 1-12, doi: 10.1109/ESEM.2019.8870181. Moderator: Courtney

May 19, 2022

Wang, T., Yang, D., & Wang, X. (2021). Identifying and mitigating spurious correlations for improving robustness in nlp models. arXiv preprint arXiv:2110.07736. Moderator: Chenyang

May 12, 2022

T. Wuensche, A. Andrzejak and S. Schwedes, Detecting Higher-Order Merge Conflicts in Large Software Projects. In The 13th International Conference on Software Testing, Validation and Verification (ICST), 2020, pp. 353-363, doi: 10.1109/ICST46399.2020.00043. Moderator: Paulo

May 5, 2022

Boyd, K. L. (2021). Datasheets for Datasets help ML Engineers Notice and Understand Ethical Issues in Training Data. Proceedings of the ACM on Human-Computer Interaction, 5(CSCW2), 1-27. Moderator: Nadia

April 28, 2022

Jain, D., Ngo, H., Patel, P., Goodman, S., Findlater, L., & Froehlich, J. (2020, October). Soundwatch: Exploring smartwatch-based deep learning approaches to support sound awareness for deaf and hard of hearing users. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility (pp. 1-13). Moderator: Christian

April 21, 2022

Hofmann, M., Williams, K., Kaplan, T., Valencia, S., Hann, G., Hudson, S.E., Mankoff, J. and Carrington, P., 2019. "Occupational Therapy is Making" Clinical Rapid Prototyping and Digital Fabrication. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (pp. 1-13). Moderator: Nadia

March 31, 2022

Zolyomi, A., Begel, A., Waldern, J.F., Tang, J., Barnett, M., Cutrell, E., McDuff, D., Andrist, S. and Morris, M.R., 2019. Managing stress: The needs of autistic adults in video calling. Proceedings of the ACM on Human-Computer Interaction, 3(CSCW), pp.1-29. Moderator: Courtney

March 24, 2022

Ma'ayan, D., Ni, W., Ye, K., Kulkarni, C. and Sunshine, J., 2020, April. How domain experts create conceptual diagrams and implications for tool design. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1-14). Moderator: Cindy

March 17, 2022

Ross, A., Wu, T.S., Peng, H., Peters, M.E., & Gardner, M. (2021). Tailor: Generating and Perturbing Text with Semantic Controls. ArXiv, abs/2107.07150. Moderator: Chenyang

March 3, 2022

Muiruri, Dennis, Lucy Ellen Lwakatare, Jukka K Nurminen, and Tommi Mikkonen. Practices and Infrastructures for ML Systems–An Interview Study. Preprint, 2021. Moderator: Nadia

February 24, 2022

Gitelman, Lisa, Virginia Jackson, Daniel Rosenberg, Travis D. Williams, Kevin R. Brine, Mary Poovey, Matthew Stanley et al. Data bite man: The work of sustaining a long-term study. In "Raw Data" Is an Oxymoron, (2013), MIT Press: 147-166. Moderator: Christian

February 17, 2022

Gerosa, M., Wiese, I., Trinkenreich, B., Link, G., Robles, G., Treude, C., ... & Sarma, A. (2021, May). The shifting sands of motivation: Revisiting what drives contributors in open source. In 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE) (pp. 1046-1058). IEEE. Moderator: Courtney

February 10, 2022

Nurwidyantoro, A., Shahin, M., Chaudron, M. R., Hussain, W., Shams, R., Perera, H., ... & Whittle, J. (2022). Human values in software development artefacts: A case study on issue discussions in three Android applications. Information and Software Technology, 141, 106731. Moderator: Sivana

February 3, 2022

Matteo Interlandi, Kshitij Shah, Sai Deep Tetali, Muhammad Ali Gulzar, Seunghyun Yoo, Miryung Kim, Todd Millstein, and Tyson Condie. 2015. Titian: data provenance support in Spark. Proc. VLDB Endow. 9, 3 (November 2015), 216–227. Moderator: Chenyang

January 27, 2022

Titov, Sergey D. et al. ReSplit: Improving the Structure of Jupyter Notebooks by Re-Splitting Their Cells. ArXiv abs/2112.14825 (2021). Moderator: Cindy

January 20, 2022

Passi, Samir, and Phoebe Sengers. Making data science systems work. Big Data & Society 7, no. 2 (2020): 2053951720939605. Moderator: Nadia

January 13, 2022

Sambasivan, N., Kapania, S., Highfill, H., Akrong, D., Paritosh, P., & Aroyo, L. M. (2021, May). Everyone wants to do the model work, not the data work: Data Cascades in High-Stakes AI. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (pp. 1-15). Moderator: Christian

December 10, 2021

Claes, Maëlick, et al. Do programmers work at night or during the weekend? Proceedings of the 40th International Conference on Software Engineering. 2018. Moderator: Courtney

December 3, 2021

Neville Grech and Yannis Smaragdakis. 2017. P/Taint: unified points-to and taint analysis. Proc. ACM Program. Lang. 1, OOPSLA, Article 102 (October 2017), 28 pages. Moderator: Chenyang

November 19, 2021

Gonzalez, Danielle, Thomas Zimmermann, and Nachiappan Nagappan. The State of the ML-universe: 10 Years of Artificial Intelligence & Machine Learning Software Development on GitHub. In Proceedings of the 17th International Conference on Mining Software Repositories, pp. 431-442. 2020. Moderator: Cindy

November 12, 2021

Rakova, Bogdana, Jingying Yang, Henriette Cramer, and Rumman Chowdhury. Where responsible AI meets reality: Practitioner perspectives on enablers for shifting organizational practices. Proceedings of the ACM on Human-Computer Interaction 5, no. CSCW1 (2021): 1-23. Moderator: Nadia

November 5, 2021

Yang, Qian, Jina Suh, Nan-Chen Chen, and Gonzalo Ramos. Grounding Interactive Machine Learning Tool Design in How Non-Experts Actually Build Models. In Proceedings of the 2018 Designing Interactive Systems Conference, pp. 573-584. 2018. Moderator: Christian

October 29, 2021

J. Wang, T.-Y. Kuo, L. Li and A. Zeller, Assessing and Restoring Reproducibility of Jupyter Notebooks. 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2020, pp. 138-149. Moderator: Cindy

October 22, 2021

Rodeghero, Paige, Thomas Zimmermann, Brian Houck, and Denae Ford. Please turn your cameras on: Remote onboarding of software developers during a pandemic. In 2021 IEEE/ACM 43rd International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), pp. 41-50. IEEE, 2021. Moderator: Courtney

October 15, 2021

Muralidhar, Nikhil, Sathappah Muthiah, Patrick Butler, Manish Jain, Yu Yu, Katy Burne, Weipeng Li et al. Using AntiPatterns to avoid MLOps Mistakes. arXiv preprint arXiv:2107.00079 (2021). Moderator: Chenyang

October 8, 2021

Pineau, Joelle, et al. Improving reproducibility in machine learning research (a report from the neurips 2019 reproducibility program). Journal of Machine Learning Research 22 (2021). Moderator: Nadia

September 29, 2021

Breck, Eric, et al. The ML test score: A rubric for ML production readiness and technical debt reduction. 2017 IEEE International Conference on Big Data (Big Data). IEEE, 2017. Moderator: Chenyang

September 22, 2021

Jacovi, Alon, Ana Marasović, Tim Miller, and Yoav Goldberg. Formalizing trust in artificial intelligence: Prerequisites, causes and goals of human trust in AI. Proc. FAccT (2021). Moderator: Courtney

September 15, 2021

Lipton, Z.C., 2018. The Mythos of Model Interpretability: In machine learning, the concept of interpretability is both important and slippery. Queue, 16(3), pp.31-57. Moderator: Christian

September 8, 2021

Jacobs, Abigail Z., and Hanna Wallach. Measurement and fairness. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, pp. 375-385. 2021. Moderator: Nadia

September 1, 2021

Breck, Eric, Neoklis Polyzotis, Sudip Roy, Steven Euijong Whang and Martin A. Zinkevich. Data Validation for Machine Learning. MLSys (2019). Moderator: Chenyang

August 25, 2021

Haochen He, Zhouyang Jia, Shanshan Li, Erci Xu, Tingting Yu, Yue Yu, Ji Wang, and Xiangke Liao. 2020. CP-detector: using configuration-related performance properties to expose performance bugs. In Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering (ASE ’20). Association for Computing Machinery, New York, NY, USA, 623–634. DOI:https://doi.org/10.1145/3324884.3416531. Moderator: Miguel

August 18, 2021

Qiu, H.S., Li, Y.L., Padala, S., Sarma, A. and Vasilescu, B., 2019. The signals that potential contributors look for when choosing open-source projects. Proceedings of the ACM on Human-Computer Interaction, 3(CSCW), pp.1-29. Moderator: Courtney

August 11, 2021

Waterman, Michael, James Noble, and George Allan. How much up-front? A grounded theory of agile architecture. In 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering, vol. 1, pp. 347-357. IEEE, 2015. Moderator: Christian

August 4, 2021

Dilhara, Malinda, Ameya Ketkar, and Danny Dig. Understanding Software-2.0: A Study of Machine Learning library usage and evolution. ACM Transactions on Software Engineering and Methodology (TOSEM) 30, no. 4 (2021): 1-42. Moderator: Justin

July 28, 2021

Souti Chattopadhyay, Thomas Zimmermann, and Denae Ford. 2021. Reel Life vs. Real Life: How Software Developers Share Their Daily Life through Vlogs. In Proceedings of the 29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE ’21), August 23–28, 2021, Athens, Greece. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3468264.3468599. Moderator: Kimberly

July 21, 2021

An unpublished paper draft. Moderator: Christian

July 14, 2021

X. Ma, M. Zhou, and D. Riehle, How commercial involvement affects open source projects: three case studies on issue reporting. Sci. China Inf. Sci., vol. 56, no. 8, pp. 1–13, Aug. 2013, doi: 10.1007/s11432-013-4914-6. Moderator: Theresa

July 7, 2021

Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. 2019. Model Cards for Model Reporting. In Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT* '19). Association for Computing Machinery, New York, NY, USA, 220–229. DOI:https://doi.org/10.1145/3287560.3287596. Moderator: Austin

June 30, 2021

Avelino, G. et al. On the abandonment and survival of open source projects: An empirical investigation. 2019 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM) (2019): 1-12. Moderator: Philip

June 23, 2021

Arya, Deeksha, Wenting Wang, Jin LC Guo, and Jinghui Cheng. Analysis and detection of information types of open source software issue discussions. In 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE), pp. 454-464. IEEE, 2019. Moderator: Courtney

June 16, 2021

Amershi, Saleema, Andrew Begel, Christian Bird, Robert DeLine, Harald Gall, Ece Kamar, Nachiappan Nagappan, Besmira Nushi, and Thomas Zimmermann. Software engineering for machine learning: A case study. In 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), pp. 291-300. IEEE, 2019. Moderator: Nadia

June 9, 2021

Coelho, Jailton, and Marco Tulio Valente. Why modern open source projects fail. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, pp. 186-196. 2017. Moderator: Christian

May 5, 2021

Xue Han, Tingting Yu, Michael Pradel, ConfProf: White-Box Performance Profiling of Configuration Options. In Proceedings of International Conference on Performance Engineering (ICPE), to appear, 2021. Moderator: Miguel

April 21, 2021

Naik, Aakanksha, Abhilasha Ravichander, Norman Sadeh, Carolyn Rose, and Graham Neubig. Stress test evaluation for natural language inference. arXiv preprint arXiv:1806.00692 (2018). Moderator: Christian

March 31, 2021

Adam Rule, Aurélien Tabard, and James D. Hollan. 2018. Exploration and Explanation in Computational Notebooks. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA, Paper 32, 1–12. Moderator: Cindy

March 24, 2021

Kim, Miryung, Thomas Zimmermann, Robert DeLine, and Andrew Begel. Data scientists in software teams: State of the art and challenges. IEEE Transactions on Software Engineering 44, no. 11 (2017): 1024-1038. Moderator: Nadia

March 17, 2021

Michael Schweinberger, Miruna Petrescu-Prahova, Duy Q. Vu, Disaster response on September 11, 2001 through the lens of statistical network analysis, Social Networks, Volume 37, 2014. Moderator: Christian

March 10, 2021

Patrick Park, Joshua Blumenstock, and Michael Macy. 2018. The Strength of Long-Range Ties in Population-Scale Social Networks. Science 362(6421):1410-1413. Moderator: Gabriel

March 3, 2021

S. Savage, A. Monroy-Hernández, and T. Höllerer, Botivist: Calling Volunteers to Action using Online Bots. Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing, CSCW 2016, San Francisco, CA, USA, February 27 - March 2, 2016, pp. 811-820. Moderator: Miguel

February 24, 2021

Brown, F., Renner, J., Nötzli, A., Lerner, S., Shacham, H., & Stefan, D. (2020, June). Towards a verified range analysis for JavaScript JITs. In Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (pp. 135-150). Moderator: Cindy

February 17, 2021

Megh Marathe and Kentaro Toyama. 2018. Semi-Automated Coding for Qualitative Research: A User-Centered Inquiry and Initial Prototypes. Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA, Paper 348, 1–12. DOI:https://doi.org/10.1145/3173574.3173922. Moderator: Nadia

February 10, 2021

Yu Huang, Kevin Leach, Zohreh Sharafi, Nicholas McKay, Tyler Santander, and Westley Weimer. 2020. Biases and differences in code review using medical imaging and eye-tracking: genders, humans, and machines. Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. Association for Computing Machinery, New York, NY, USA, 456–468. DOI:https://doi.org/10.1145/3368089.3409681. Moderator: Shurui

February 3, 2021

Tianyi Zhang, Zhiyang Chen, Yuanli Zhu, Priyan Vaithilingam, Xinyu Wang, Elena Glassman. Interpretable Program Synthesis. CHI '21: Proceedings of the 2021 Conference on Human Factors in Computing Systems. Moderator: Chu-Pan

January 27, 2021

Os Keyes, Jevan Hutson, and Meredith Durbin. 2019. A Mulching Proposal: Analysing and Improving an Algorithmic System for Turning the Elderly into High-Nutrient Slurry. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, New York, NY, USA, Paper alt06, 1–11. Moderator: Christian

January 20, 2021

T. Lopez, H. Sharp, T. Tun, A. Bandara, M. Levine and B. Nuseibeh. “Hopefully We Are Mostly Secure”: Views on Secure Code in Professional Practice. Proc. of CHASE: 61–68 (2019). Moderator: Gabriel

January 13, 2021

Marcin Copik, Alexandru Calotoiu, Tobias Grosser, Nicolas Wicki, Felix Wolf, Torsten Hoefler. Extracting Clean Performance Models from Tainted Programs. In Proc. of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), Seoul, South Korea, ACM, February 2021. Moderator: Miguel

January 6, 2021

Miryung Kim, Thomas Zimmermann, Robert DeLine, and Andrew Begel. 2016. The emerging role of data scientists on software development teams. In Proceedings of the 38th International Conference on Software Engineering. Association for Computing Machinery, New York, NY, USA, 96–107. DOI:https://doi.org/10.1145/2884781.2884783. Moderator: Cindy