History grokker #29

Open
jhannah opened this issue Jul 22, 2013 · 13 comments

@jhannah
Member

jhannah commented Jul 22, 2013

So for years I've been threatening to write a history grokker that reads our XML file and produces an RSS-style history log and a graph of # of active groups over time, etc. @rGeoffrey is presenting a "State of the Onion" at OSCON this Thursday. Maybe I ought to get around to it NOW...

grep 'status' on the current file is trivial. The trick is writing a program to understand the context of each of the 311 changes to that file since 2006. git log --patch --reverse perl_mongers.xml is a conceptual starting point. But I suspect we'll need some pretty heavy lifting before the context of many of those changes can be understood in human-readable terms suitable for an RSS feed. Something like:

  • 3c25cdc - 2006-04-30 - ignore
  • c8d7d2c - 2006-05-24 - Helsingborg.pm new group leader: Stefan Midjich
  • 98ae765 - 2006-05-26 - New group: Kaiserslautern.pm
  • ... 308 more :)

A graph of simple stats is easier. For each of those commits, just pull out the XML at that point in time and do a grep status | sort | uniq -c (the sort matters, since uniq -c only merges adjacent lines)...
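
A minimal sketch of that per-commit count, written in Python only for brevity. It assumes each group is a `<group>` element with a `status` attribute, and that a missing attribute corresponds to the "undef" bucket; both are assumptions about the layout of perl_mongers.xml, not something confirmed in this thread. The helper names are hypothetical, and the script is assumed to run from the repository root.

```python
import subprocess
import xml.etree.ElementTree as ET
from collections import Counter

XML_PATH = "perl_mongers.xml"  # assumed to sit at the repository root


def commits_touching(path=XML_PATH):
    """Return (short hash, date) pairs for commits that changed `path`, oldest first."""
    out = subprocess.run(
        ["git", "log", "--reverse", "--format=%h %ad", "--date=short", "--", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return [tuple(line.split(" ", 1)) for line in out.splitlines() if line]


def xml_at(commit, path=XML_PATH):
    """Return the raw bytes of `path` as of `commit`."""
    return subprocess.run(
        ["git", "show", f"{commit}:{path}"],
        check=True, capture_output=True,
    ).stdout


def status_counts(xml_bytes):
    """Count groups per status (assumption: <group status="...">)."""
    root = ET.fromstring(xml_bytes)
    # Groups with no status attribute are counted as "undef" (an assumption).
    return Counter(g.get("status", "undef") for g in root.iter("group"))


def print_series():
    """Print one line of status counts per commit that touched the file."""
    for commit, date in commits_touching():
        try:
            counts = status_counts(xml_at(commit))
        except ET.ParseError:
            continue  # skip revisions where the XML was not well-formed
        summary = ", ".join(f"{s}={n}" for s, n in sorted(counts.items()))
        print(f"{date} {commit} {summary}")
```

Calling `print_series()` inside a checkout would emit one row per commit, which could then be fed to any plotting tool to graph active groups over time.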

irc.perl.org #mongers is the IRC channel for discussion. 👍

@evaddnomaid

Sounds like fun... Do we have a sample input and sample output?

@akarelas

One could make it with XML::MyXML (to publicize my module a bit)

@djgoku
Contributor

djgoku commented Jul 22, 2013

I was just starting an XML::Rabbit dist for the XML file.

@djgoku
Contributor

djgoku commented Jul 22, 2013

@evaddnomaid

@djgoku thanks, so perl_mongers.xml (and its history) is our input? And we want a single RSS file as output, showing the changes in number of active groups over time?

Should we get a discussion going, on IRC maybe?

@djgoku
Contributor

djgoku commented Jul 22, 2013

$ perl -Ilib t/01-pm.t 
ok 1 - The object isa PM::Grokker
1..1

@djgoku
Contributor

djgoku commented Jul 22, 2013

Wow, there are 717 groups! Fun stat.

@djgoku
Contributor

djgoku commented Jul 22, 2013

Total Groups 717:
Group statuses and counts below:

Status: Vetoed by Robert. See RT 57812. Count: 1    
Status: active               Count: 254  
Status: dead                 Count: 25   
Status: disabled             Count: 1    
Status: disbanded to make room for other groups -jhannah 20061203 Count: 1    
Status: gone                 Count: 17   
Status: inactive             Count: 172  
Status: leb                  Count: 13   
Status: mlb                  Count: 40   
Status: on hold              Count: 1    
Status: sleeping             Count: 35   
Status: spam                 Count: 1    
Status: undef                Count: 152  
Status: unknown              Count: 4

@n1vux
Contributor

n1vux commented Jul 23, 2013

I've always been fond of XML::Twig for plucking bits from XML.
Step 0. Omit all but the latest version per date.
Step 1. Cull just Group Name and Status into a proxy file for each version.
Step 2. Diff the proxy files to generate status change events for each date. Delete dates with no change in group name/status.
Step 3. Compute statistics for each date to create a time series.
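
Steps 0 through 2 could be sketched roughly as below, in Python for brevity. This is only an illustration: each "proxy file" is modeled as a plain dict of group name to status, dates are assumed to be ISO strings (so they sort correctly as text), and the function names and event wording are hypothetical.

```python
def status_events(before, after):
    """Compare two {group name: status} snapshots and describe what changed."""
    events = []
    for name in sorted(after.keys() - before.keys()):
        events.append(f"New group: {name} ({after[name]})")
    for name in sorted(before.keys() - after.keys()):
        events.append(f"Removed group: {name}")
    for name in sorted(before.keys() & after.keys()):
        if before[name] != after[name]:
            events.append(f"{name} status: {before[name]} -> {after[name]}")
    return events


def events_by_date(snapshots):
    """Turn (date, {name: status}) snapshots, oldest first, into (date, events).

    Several snapshots may share a date; only the last one per date is kept
    (step 0). Dates with no name/status change are dropped (step 2).
    """
    latest = {}
    for date, groups in snapshots:
        latest[date] = groups  # a later snapshot on the same date wins
    result = []
    previous = None
    for date in sorted(latest):
        current = latest[date]
        if previous is not None:
            events = status_events(previous, current)
            if events:
                result.append((date, events))
        previous = current
    return result
```

Each `(date, events)` pair maps naturally onto one entry in an RSS-style log; commits that only touched other fields (email addresses, URLs) fall out as "no change" and are skipped.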

@oalders
Contributor

oalders commented Jul 8, 2023

Is it possible to regenerate the numbers that @djgoku ran back in 2013?

@djgoku
Contributor

djgoku commented Jul 8, 2023

> Is it possible to regenerate the numbers that @djgoku ran back in 2013?

I don’t remember writing this, but it has what you want, I think. lol

https://github.com/djgoku/PM-Grokker/blob/master/bin/grokker.pl

@oalders
Contributor

oalders commented Jul 9, 2023

That's it! Thanks, @djgoku. 😄

@oalders
Contributor

oalders commented Jul 9, 2023

Status: Vetoed by Robert. See RT 57812. Count: 1
Status: active               Count: 210
Status: dead                 Count: 24
Status: disabled             Count: 1
Status: disbanded to make room for other groups -jhannah 20061203 Count: 1
Status: gone                 Count: 16
Status: inactive             Count: 237
Status: leb                  Count: 13
Status: mlb                  Count: 40
Status: on hold              Count: 1
Status: sleeping             Count: 34
Status: spam                 Count: 1
Status: undef                Count: 148
Status: unknown              Count: 4

So, the diff would be:

$ git diff --no-index before.txt after.txt
diff --git a/before.txt b/after.txt
index c38a104..719d684 100644
--- a/before.txt
+++ b/after.txt
@@ -1,14 +1,14 @@
-Status: active               Count: 254
-Status: dead                 Count: 25
+Status: active               Count: 210
+Status: dead                 Count: 24
 Status: disabled             Count: 1
 Status: disbanded to make room for other groups -jhannah 20061203 Count: 1
-Status: gone                 Count: 17
-Status: inactive             Count: 172
+Status: gone                 Count: 16
+Status: inactive             Count: 237
 Status: leb                  Count: 13
 Status: mlb                  Count: 40
 Status: on hold              Count: 1
-Status: sleeping             Count: 35
+Status: sleeping             Count: 34
 Status: spam                 Count: 1
-Status: undef                Count: 152
+Status: undef                Count: 148
 Status: unknown              Count: 4
 Status: Vetoed by Robert. See RT 57812. Count: 1
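
For anyone who prefers signed deltas to a line diff, here is a small Python sketch that recomputes the change per status and the overall totals. The counts are copied from the two listings in this thread; the variable and function names are just for illustration.

```python
# Counts copied from the 2013 and 2023 listings in this thread.
BEFORE_2013 = {
    "Vetoed by Robert. See RT 57812.": 1,
    "active": 254, "dead": 25, "disabled": 1,
    "disbanded to make room for other groups -jhannah 20061203": 1,
    "gone": 17, "inactive": 172, "leb": 13, "mlb": 40, "on hold": 1,
    "sleeping": 35, "spam": 1, "undef": 152, "unknown": 4,
}
AFTER_2023 = {
    "Vetoed by Robert. See RT 57812.": 1,
    "active": 210, "dead": 24, "disabled": 1,
    "disbanded to make room for other groups -jhannah 20061203": 1,
    "gone": 16, "inactive": 237, "leb": 13, "mlb": 40, "on hold": 1,
    "sleeping": 34, "spam": 1, "undef": 148, "unknown": 4,
}


def deltas(before, after):
    """Return {status: after - before} for every status whose count changed."""
    changed = {}
    for status in sorted(before.keys() | after.keys()):
        diff = after.get(status, 0) - before.get(status, 0)
        if diff != 0:
            changed[status] = diff
    return changed


changes = deltas(BEFORE_2013, AFTER_2023)
for status, diff in changes.items():
    print(f"{status:<10} {diff:+d}")
print(f"total      {sum(BEFORE_2013.values())} -> {sum(AFTER_2023.values())}")
```

The 2013 total comes out to 717, matching the figure reported earlier, and the 2023 listing sums to 731. The largest movements are active (-44) and inactive (+65).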
