Handle huge log files (was crashes on a large custom log file) #39

Closed
GoogleCodeExporter opened this issue Feb 9, 2016 · 8 comments

Comments

@GoogleCodeExporter

What steps will reproduce the problem?
1. Create a large custom log file (700 MB in my case).
2. Run gource
3. Crash!

What is the expected output? What do you see instead?
That gource will run. I see a crash.


What version of the product are you using? On what operating system?
0.23 on Windows

Please provide any additional information below.
N/A

Original issue reported on code.google.com by Dudley....@gmail.com on 21 Jan 2010 at 7:56

@GoogleCodeExporter

I will have a think about how to handle this, though it seems like an edge case.
The largest real-world log I currently test with is from the Linux kernel, and
that's only about 40 megs.

Thanks for the report.

Original comment by acaudw...@gmail.com on 22 Jan 2010 at 11:42

  • Changed title: Handle huge log files (was crashes on a large custom log file)
  • Changed state: Accepted
  • Added labels: Priority-Low, Type-Enhancement
  • Removed labels: Priority-Medium, Type-Defect

@GoogleCodeExporter

Just to let you know, this log comes from a real-world ClearCase repository. I was
able to get it to display (albeit slowly) in code_swarm, since code_swarm supports
a pre-sorted log file.

Original comment by Dudley....@gmail.com on 22 Jan 2010 at 4:11

@GoogleCodeExporter

Hi. This should hopefully be fixed now.

If the log file given to Gource is bigger than 100 megs, it will fall back to
seeking the file handle rather than loading the file into memory (which makes
the seekbar feature a bit clunky, but it should at least work).
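
The fallback described above can be sketched roughly as follows. This is only an illustration in Python, not Gource's actual code (Gource is written in C++); `LogReader`, `SIZE_THRESHOLD`, and `seek_fraction` are hypothetical names, and reading "100 megs" as 100 MiB is an assumption.

```python
import os

SIZE_THRESHOLD = 100 * 1024 * 1024  # assumption: "100 megs" read as 100 MiB


class LogReader:
    """Hypothetical reader: in-memory for small logs, seek-based for large ones."""

    def __init__(self, path, threshold=SIZE_THRESHOLD):
        self.in_memory = os.path.getsize(path) <= threshold
        self._file = None
        if self.in_memory:
            # Small log: load every line so seeking is a simple index change.
            with open(path, "rb") as f:
                self._lines = f.read().splitlines()
            self._pos = 0
        else:
            # Large log: keep only a file handle and read lines on demand.
            self._file = open(path, "rb")

    def next_line(self):
        """Return the next line as bytes, or None at the end of the log."""
        if self.in_memory:
            if self._pos >= len(self._lines):
                return None
            line = self._lines[self._pos]
            self._pos += 1
            return line
        line = self._file.readline()
        return line.rstrip(b"\r\n") if line else None

    def seek_fraction(self, fraction):
        """Jump to roughly `fraction` (0.0 to 1.0) of the way through the log."""
        if self.in_memory:
            self._pos = int(fraction * len(self._lines))
            return
        self._file.seek(0, os.SEEK_END)
        size = self._file.tell()
        self._file.seek(int(fraction * size))
        if self._file.tell() > 0:
            # Discard the rest of the line we landed in. If the offset happens
            # to fall on a line boundary, this skips one whole line.
            self._file.readline()

    def close(self):
        if self._file is not None:
            self._file.close()
```

In the seek-based mode the reader can only jump to a byte offset and resynchronise on the next newline, which is one plausible reason the seekbar would feel less precise on very large logs.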

http://gource.googlecode.com/files/gource-0.24-beta2.tar.gz

Original comment by acaudw...@gmail.com on 31 Jan 2010 at 10:29

  • Changed state: Fixed

@GoogleCodeExporter

Dudley, would you be willing to share your script for creating the custom log
file format from cleartool lshistory output?  Thanks.

Original comment by djpotte...@gmail.com on 8 May 2010 at 5:12

@GoogleCodeExporter

@djpotter77

Sure. I will need to dig it up, but I should be able to get it sometime this week.

Original comment by Dudley....@gmail.com on 9 May 2010 at 3:09

@GoogleCodeExporter

# This is the basic script, which actually converts to the code_swarm format, but
# you can easily modify it to work for Gource. I actually wrote a separate script
# to convert from code_swarm to Gource.

#!/usr/local/bin/python

"""
   Take the output of the ClearCase history command
   ct lshistory -all -fmt "Element: %n| Date: %d| User:%u| Operation: %e| Object: %[type]p| SimpleType: %m| OperationKind: %o\n"
   and turn it into something usable by code_swarm
"""

fileTypesWeCareAbout = ['compressed_file', 'compressed_text_file', 'file',
                        'html', 'text_file', 'xml']



import sys
import time

def processDate(date):
   d = date[:-6]
   d = time.strptime(d, "%Y-%m-%dT%H:%M:%S")
   return int(time.mktime(d))*1000

def processElement(e):
   return e.split("@@")[0] # just strip off the version info

def processLineIntoTuple(line):
   """Take the line and split it out into a dictionary"""
   d = {}
   for i in line.split("|"):
      l = i.split(":",1)
      if (len(l)) == 2:
         d[l[0].strip()] = l[1].strip()
      else:
         d[l[0].strip()] = ""
   return d

def UseThis(d):
   try:
      if d['OperationKind'] != "checkin":
         return 0
      if d['Object'] not in fileTypesWeCareAbout:
         return 0
   except KeyError:
      return 0

   return 1

def XMLize(d):
   print '<event date="%d" filename="%s" author="%s" />' % (d['Date'], d['Element'], d['User'])


for x in sys.stdin.readlines():
   d = processLineIntoTuple(x)
   if UseThis(d):
      d['Date'] = processDate(d['Date'])
      d['Element'] = processElement(d['Element'])
      XMLize(d)
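
For readers who want Gource output directly, here is a minimal Python 3 sketch of the same filtering that emits Gource's pipe-delimited custom log lines (`timestamp|username|type|file`, sorted by time) instead of code_swarm XML. It is not the separate converter mentioned above, which was not posted. `parse_line` and `convert` are hypothetical names, every checkin is written as `M` (modified) as a simplification, and unlike the script above it honours the timezone offset in the date rather than stripping it.

```python
from datetime import datetime

# Same element types the script above keeps.
FILE_TYPES = {"compressed_file", "compressed_text_file", "file",
              "html", "text_file", "xml"}


def parse_line(line):
    """Split one 'Key: value| Key: value' lshistory line into a dict."""
    fields = {}
    for part in line.split("|"):
        key, _, value = part.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def convert(lines):
    """Return Gource custom-log lines for checkins of the file types we keep."""
    events = []
    for line in lines:
        f = parse_line(line)
        if f.get("OperationKind") != "checkin":
            continue
        if f.get("Object") not in FILE_TYPES:
            continue
        if "Date" not in f or "User" not in f or "Element" not in f:
            continue
        # fromisoformat (Python 3.7+) accepts offsets such as "-05:00".
        ts = int(datetime.fromisoformat(f["Date"]).timestamp())
        path = f["Element"].split("@@")[0]  # strip the version suffix
        events.append((ts, f["User"], path))
    events.sort()  # Gource expects entries in chronological order
    # Assumption: every checkin is reported as "M" (modified).
    return ["%d|%s|M|%s" % event for event in events]

# Usage sketch: sys.stdout.write("\n".join(convert(sys.stdin)) + "\n")
```

User names or paths containing `|` would break the output format, so a real converter would need to handle them.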

Original comment by Dudley....@gmail.com on 11 May 2010 at 7:31

@GoogleCodeExporter

Hi

Getting this error when running the script:
    print <event date="%d" filename="%s" author="%s" /> % (d['Date'], d['Element'], d['User'])
                    ^
SyntaxError: invalid syntax

What could be wrong? 

thanks

Original comment by blomqvis...@gmail.com on 24 Jun 2010 at 1:59

@GoogleCodeExporter

It looks like you are missing the single quotes around the format string.

You have:
   print <event date="%d" filename="%s" author="%s" /> % (d['Date'], d['Element'], d['User'])

and it should be:
   print '<event date="%d" filename="%s" author="%s" />' % (d['Date'], d['Element'], d['User'])
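
On Python 3, where `print` is a function, the same line needs parentheses. A minimal sketch follows, assuming Python 3; `event_line` is a hypothetical helper, and the attribute escaping via the standard library's `quoteattr` is an addition that the original script does not do.

```python
from xml.sax.saxutils import quoteattr


def event_line(date_ms, filename, author):
    """Build one code_swarm <event /> element with escaped attribute values."""
    # quoteattr escapes &, < and > and adds the surrounding quotes itself.
    return "<event date=%s filename=%s author=%s />" % (
        quoteattr(str(date_ms)), quoteattr(filename), quoteattr(author))


print(event_line(1264060560000, "src/a&b.c", "dudley"))
```

Escaping matters when file or user names contain characters such as `&`, which would otherwise produce invalid XML.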

Original comment by Dudley....@gmail.com on 24 Jun 2010 at 8:26
