[research] MSR blog posts planning #55

Closed
warenlg opened this issue Apr 10, 2019 · 14 comments

Comments

@warenlg

warenlg commented Apr 10, 2019

This year, @vmarkovtsev, @m09 and I are going to attend MSR, taking place in Montreal on the 26th and 27th of May 2019. As discussed with @vcoisne, given the success of the MSR interview series, we should think about producing similar content with less effort ("aka nightmare", as Vadim put it).

We agreed on the following proposal that would not require much upstream devrel effort and would allow us to focus on the content of the conference:

  1. We compile together the list of the most interesting talks/papers/topics that are going to be covered during the 2-day conference, and make a schedule so we know who is going to take notes, for example:
  2. One of us takes notes during each selected presentation.
  3. We introduce ourselves to the authors during the conference or the beer payback, and ask them for their slides + questions. We tell them that we might write a blog post about their work, which we would send them later for review.
  4. After the conference, we compile blog posts based on the notes/paper/slides.
  5. We send each blog post to its authors, adding interview questions.
  6. If they answer, good: we will have a nice blog post. If not, we can still publish it as an MSR paper review (cf. Alex's blog posts).
@warenlg
Author

warenlg commented Apr 10, 2019

Below is the list of technical papers (no data showcase papers yet) that are, IMO, the most relevant to MLonCode and what we do. In bold, you have authors that we already know personally.

We can vote for the papers that are worth taking notes on by putting our names at the end of the line (of course, anybody can upvote a paper if they want a blog post about it).

@warenlg
Author

warenlg commented Apr 10, 2019

Sunday 26th

Session I: Representations for Mining - Room 1

Session II: Defect Prediction and Testing - Room 2

Session V: Large-Scale Mining - Room 1

Monday 27th

Session I: APIs & Dependencies - Room 1

Session II: Automatic Summarization - Room 2

Session V: Collaboration & Communication - Room 1

Session VIII: Software Quality - Room 2

Session IX: Traceability - Room 1

@bzz

bzz commented Apr 11, 2019

Just FYI - I would be very happy to help with any of those, so please let me know if any help is needed.

11:16 - 11:22 (short) PathMiner: A Library for Mining of Path-Based Representations of Code - Vladimir Kovalenko, Egor Bogomolov, Timofey Bryksin, Alberto Bacchelli from JetBrains Research and University of Zurich

This is about https://github.com/vovak/astminer, and 2 of the authors, Timofey and Vladimir, were at our FOSDEM 2019 ML on Code DevRoom and the speakers' dinner.

Semantic Source Code Models Using Identifier Embeddings - Vasiliki Efstathiou, Diomidis Spinellis from Athens University of Economics and Business

Is almost exactly id2vec + WMD that we did before, the only difference is that

@vmarkovtsev changed the title from "MSR blog posts planning" to "[research] MSR blog posts planning" on May 23, 2019
@vcoisne

vcoisne commented May 29, 2019

@warenlg @vmarkovtsev @m09 How many blog posts will we have in the end, and what are the target dates for publication? I am trying to get some visibility on our content calendar.

@vmarkovtsev
Collaborator

Waren is on vacation until the 14th, and I am going on vacation on the 24th. I scheduled a meeting with Waren and Hugo on the 14th to discuss our opportunities and plans. I would say that we will certainly not finish anything in June, and the most likely date for the first post is the end of July.

@vcoisne

vcoisne commented Jun 3, 2019

@vmarkovtsev What about a quick MSR recap blog post in June with some tweets, links to slides from your favorite talks, photos from the conference and the beer payback event, etc.?

@vmarkovtsev
Collaborator

That's an option, sounds reasonable...

@warenlg
Author

warenlg commented Jun 4, 2019

From what I've seen this year, I suggest that we could write blog posts about the following topics/people:

  • [friend] Stefano Zacchiroli from Software Heritage on their impressive contribution with their 85 million repositories dataset
  • [friend] Katsuro Inoue - author of CCFinder, pioneer in code deduplication
  • Rob Deline from Microsoft Research - keynote speaker and author of the nice paper Software Engineering for Machine Learning: a case study
  • Bart Theeten from Nokia Bell Labs, who made a contribution relevant to us with his paper Import2vec: learning embeddings for software libraries + his 3D visualizations were super cool
  • Daniel German from UVictoria, author of The perils of Mining GitHub, which got the most influential paper award. His presentation at MSR about future advances in version control was pretty interesting and original.
  • [friend] Nicolas Harrand and Benoit Baudry from KTH gave 2 nice presentations about the Maven dependency graph and software diversity. More generally, I believe they work on other cool projects we could say a few words about. Hugo visited them 3 months ago.
  • [friend] Daniel Perez also gave a talk relevant to us about Cross-language clone detection by learning over abstract syntax trees, quite similar to what we are doing with Gemini. We already invited him to describe his approach during one of the reading club sessions.
  • [friend] Sebastian Baltes, who introduced the SOTorrent dataset at MSR last year. I talked with him at length last year and again this year. His dataset has been recognized as a major contribution to SE research in general and was selected as the reference dataset for the Mining Challenge this year.

@vmarkovtsev
Collaborator

@vmarkovtsev
Collaborator

The planning is done.

@vcoisne

vcoisne commented Jun 25, 2019

How is this coming along, @vmarkovtsev @warenlg @m09? Can we create new issues on https://github.com/src-d/blog to track each post individually?

@m09
Contributor

m09 commented Jul 24, 2019

How is this coming along, @vmarkovtsev @warenlg @m09? Can we create new issues on https://github.com/src-d/blog to track each post individually?

I forgot to reply on this issue as well, but it was decided some time ago in DMs on Slack that we should indeed create them. I'll do it for the posts that we have not started yet.

@m09
Contributor

m09 commented Jul 24, 2019

As you can see from the reference spam above, we now have issues for all the blog posts left to write. Please comment on the issue when you start working on a post to avoid duplication of effort! :)

@vmarkovtsev
Collaborator

The blog posts have been planned. The original issue can be closed now.
