lag_calc unexpected keyword argument #71

Closed
TraziBatine opened this issue Nov 21, 2016 · 24 comments

@TraziBatine

TraziBatine commented Nov 21, 2016

Hi!

I am trying to get a catalog of lag_calc events, but I have a problem with it.
At the link below you can find the script, one day of data, and the templates.

https://drive.google.com/open?id=0BzMcKedS7yadZjY4Zkk4U1hEOFE

The error that I get is:
TypeError: lag_calc() got an unexpected keyword argument 'horizontal_chans'

The lines calling lag_calc are at the end of the script.

Why am I getting this error?

Thanks,
blaz

@calum-chamberlain
Member

calum-chamberlain commented Nov 21, 2016

You are looking at the docs for the develop branch, but you have the current release installed. horizontal_chans is not in a released version. Either don't use this argument, or, if you trust the develop branch (there shouldn't be bugs), clone the repo and install that branch using:
git checkout develop
pip install . -U

@TraziBatine
Author

Awesome, thanks, I will try the develop branch.
Do you know whether I can expect any differences from using these two arguments or not?

thank you,
blaz

@calum-chamberlain
Member

calum-chamberlain commented Nov 21, 2016

The current master uses the same defaults as develop; the new options just give you more control for non-standard networks.
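
For illustration, a sketch of how the call might look on the develop branch. The channel lists shown here are assumptions based on the develop docs rather than a released API, and the variable names follow the script linked above:

# Assumed develop-branch arguments: horizontal_chans/vertical_chans name the
# channel codes that carry S-picks and P-picks respectively.
cat_cc = lag_calc.lag_calc(
    detections=detections, detect_data=st,
    template_names=template_files, templates=templates,
    shift_len=0.2, min_cc=0.4,
    horizontal_chans=['E', 'N', '1', '2'],  # assumed default: S-picks
    vertical_chans=['Z'],                   # assumed default: P-picks
    cores=4, interpolate=False, plot=False)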

@TraziBatine
Author

TraziBatine commented Nov 23, 2016

Hi! I ran into another error that I don't understand.

So, basically my script runs without problems: on the first day I get some detections and lag_calc does its work as intended, while on the next day it dies with:

Exception in thread Thread-123:
Traceback (most recent call last):
File "/home/blaz/anaconda2/lib/python2.7/threading.py", line 801, in __bootstrap_inner
self.run()
File "/home/blaz/anaconda2/lib/python2.7/threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "/home/blaz/anaconda2/lib/python2.7/multiprocessing/pool.py", line 389, in _handle_results
task = get()
TypeError: ('__init__() takes exactly 2 arguments (1 given)', <class 'eqcorrscan.core.lag_calc.LagCalcError'>, ())

At the link below you can find the day with a successful detection and successful lag_calc, and the next day that breaks the code.
https://drive.google.com/open?id=0BzMcKedS7yadN09qT29Lc3pWRTA

Any idea?
Thanks!

@calum-chamberlain
Member

calum-chamberlain commented Nov 23, 2016

It's an error that happens when running in parallel, but it didn't get raised properly because of the parallelism; try setting parallel=False and the error message should be raised properly.
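
For context, this masking is a known Python 2 pitfall: a custom exception whose __init__ takes a required argument but never calls Exception.__init__ ends up with empty args, so multiprocessing cannot rebuild it after unpickling. A minimal sketch of the pattern - the class body here is a guess at the shape, not EQcorrscan's actual code:

import pickle

class LagCalcError(Exception):
    def __init__(self, value):
        # Stores the message itself but never calls Exception.__init__,
        # so self.args stays empty - this is the guessed bug pattern.
        self.value = value

err = LagCalcError("lag-calc has decreased cccsum")
# Unpickling rebuilds the exception as LagCalcError(*err.args), i.e. with
# no arguments, which raises the TypeError seen in the traceback above:
pickle.loads(pickle.dumps(err))
# TypeError: __init__() takes exactly 2 arguments (1 given)

Calling Exception.__init__(self, value) in the subclass's __init__ would populate args and make the exception picklable.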

@TraziBatine
Author

Yes, thanks, this works... now I get this error:

LagCalcError: lag-calc has decreased cccsum from 3.724045 to 2.530662 - report this error
I could just try/except it, but I would like to know what is happening here, and preferably still keep the lag-calced event.

thanks
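
A minimal sketch of the try/except workaround mentioned above, assuming LagCalcError can be imported from eqcorrscan.core.lag_calc (the class named in the earlier traceback); note it skips the whole day's lag-calc rather than keeping the event:

from eqcorrscan.core.lag_calc import LagCalcError

try:
    cat_cc = lag_calc.lag_calc(detections=unique_detections, detect_data=st,
                               template_names=template_files, templates=templates,
                               shift_len=0.2, min_cc=0.4, cores=4,
                               interpolate=False, plot=False, parallel=False)
except LagCalcError as e:
    # The whole call fails, so the day's events are lost - log and move on
    # instead of crashing the outer loop.
    print "lag_calc failed:", e
    cat_cc = None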

@calum-chamberlain
Member

calum-chamberlain commented Nov 24, 2016 via email

@TraziBatine
Author

detections = match_filter.match_filter(template_names=template_files,
                                       template_list=templates, st=st,
                                       threshold=9, threshold_type='MAD',
                                       trig_int=3, plotvar=False,
                                       plotdir='./plots', cores=4)

# Write all detections to a daily file
for detection in detections:
    detection.write("detections/" + day + '_detections', append=True)

# Keep only the strongest detection within any 6 s window
unique_detections = []
for master in detections:
    if float(master.threshold) != 0 and \
       abs(float(master.detect_val)) / float(master.threshold) > 1.1:
        keep = True
        for slave in detections:
            if not master == slave and \
               abs(master.detect_time - slave.detect_time) <= 6.0:
                # If the events are within 6 s of each other, keep whichever
                # was the 'best' (strongest) detection
                if not master.detect_val > slave.detect_val:
                    keep = False
                    break
        if keep:
            unique_detections.append(master)

for detw in unique_detections:
    detw.write("detections/6s/" + day + '_detections_unique', append=True)
    print detw.template_name

cat_cc = lag_calc.lag_calc(detections=unique_detections, detect_data=st,
                           template_names=template_files, templates=templates,
                           shift_len=0.2, min_cc=0.4, cores=4,
                           interpolate=False, plot=False, parallel=False)

if not os.path.isdir("events_cc/"):
    os.makedirs("events_cc")

for i, event in enumerate(sorted(cat_cc)):
    event.write("events_cc/" + str(i) + ".xml", format="QUAKEML")
    event.write("events_cc/" + str(i) + ".obs", format="NLLOC_OBS")

This is my script... the only thing here that I could see as "not being the same data" is that I am running lag_calc on unique_detections, but if I try it with detections, the same error occurs.

Or does the error come from the template_names=template_files, templates=templates part of lag_calc? I did sorted(template_files), but if I try sorted(templates), it says that this is not possible.

Thanks!
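
As an aside on the sorting question: there is no meaningful way to order Stream objects, but the two lists can be kept aligned by sorting them as (name, stream) pairs on the file name instead. A minimal sketch:

# Streams cannot be compared directly, so sort (name, stream) pairs by name
# and split them apart again - template_files and templates stay aligned.
pairs = sorted(zip(template_files, templates), key=lambda p: p[0])
template_files = [name for name, stream in pairs]
templates = [stream for name, stream in pairs]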

@calum-chamberlain
Member

calum-chamberlain commented Nov 25, 2016 via email

@TraziBatine
Author

Sure, no problem, take your time.

thanks

@calum-chamberlain
Member

Hi @blazvicic, just getting to have a look at this - the above script is either not formatted correctly or missing its top lines. Can you check and repost, please?

@TraziBatine
Author

TraziBatine commented Nov 29, 2016

Sure... here it is:

from eqcorrscan.core import match_filter, lag_calc
from eqcorrscan.utils import pre_processing
from eqcorrscan.utils.plotting import detection_multiplot
from obspy import read, Catalog
import glob
import os

outfile = open("failed_days", "w")

# Read in the templates
template_path = 'templates/'
template_files = sorted(glob.glob(os.path.join(template_path, '*')))

templates = []
for template_file in template_files:
    templates.append(read(template_file))

for template in templates:
    for tr in template:
        # Change channel names to two letters because EQcorrscan uses Seisan
        # style channel names, which are not standard, and should be changed!
        # Note that I might change EQcorrscan to use proper three-letter
        # channel names in a future release, but I will ensure it works with
        # two-letter channel names too.
        tr.stats.channel = tr.stats.channel[0] + tr.stats.channel[-1]
        # pre_processing.shortproc(st=tr, lowcut=3.0, highcut=6.0,
        #                          filt_order=3, samp_rate=20)

days = os.listdir('./24h/')

for day in sorted(days):
    # Create the detections and detections/6s folders used for detection files
    if not os.path.isdir("detections/"):
        os.makedirs("detections/6s")  # also creates detections/

    print "Days with data: ", sorted(days)
    print "Number of days with data: ", len(days)
    print "Match filter started for day: ", day

    # Read in and process the day-long data.
    # There is something wrong with the way Antelope sets the sampling rate
    # in the miniSEED files - they report 199.99 Hz, which is why the rate
    # is manually set to 200 Hz here.
    try:
        st = read('24h/' + day + '/*')

        print "Setting sampling rates of the traces"
        for tr in st:
            tr.stats.channel = tr.stats.channel[0] + tr.stats.channel[-1]
            if not tr.stats.station == "VINO":
                if tr.stats.sampling_rate != 200.0:
                    tr.stats.sampling_rate = 200.0
            else:
                if tr.stats.sampling_rate != 100.0:
                    tr.stats.sampling_rate = 100.0
        print "Merging traces"
        st = st.merge(method=1, fill_value=0)
        print "Detrending traces"
        st = st.detrend('constant')
        # Remove traces that are shorter than 17,270,000 samples (a full day
        # at 200 Hz is 17,280,000 samples), otherwise this will not work.
        # What can be done?
        print "Removing traces shorter than 17270000 samples"

        for tr in st:
            if not tr.stats.station == "VINO":
                if len(tr) < 17270000:
                    st.remove(tr)
                    print "trace with less than 17270000 samples and not VINO:", tr.stats.station
            else:
                if len(tr) < 8640000:
                    st.remove(tr)
                    print "trace with less than 8640000 samples removed:", tr.stats.station
        # Some streams will be empty after st.remove, so we can't do anything
        # with them...
        if len(st) > 0:
            # Use the same filtering and sampling parameters as your template!
            print "Preprocessing stream"
            # st = pre_processing.shortproc(st, lowcut=3.0, highcut=6.0,
            #                               filt_order=3, samp_rate=20,
            #                               starttime=st[0].stats.starttime)
            # Forced day-long processing
            st = pre_processing.dayproc(st, lowcut=3.0, highcut=6.0,
                                        filt_order=3, samp_rate=20,
                                        starttime=st[0].stats.starttime)

            print(st.__str__(extended=True))

            print "Starting with match filter"
            detections = match_filter.match_filter(template_names=template_files,
                                                   template_list=templates, st=st,
                                                   threshold=9, threshold_type='MAD',
                                                   trig_int=3, plotvar=False,
                                                   plotdir='./plots', cores=4)

            # Write all detections to a daily file
            for detection in detections:
                detection.write("detections/" + day + '_detections', append=True)

            # Keep only the strongest detection within any 6 s window
            unique_detections = []
            for master in detections:
                if float(master.threshold) != 0 and \
                   abs(float(master.detect_val)) / float(master.threshold) > 1.1:
                    keep = True
                    for slave in detections:
                        if not master == slave and \
                           abs(master.detect_time - slave.detect_time) <= 6.0:
                            # If the events are within 6 s of each other, keep
                            # whichever was the 'best' (strongest) detection
                            if not master.detect_val > slave.detect_val:
                                keep = False
                                break
                    if keep:
                        unique_detections.append(master)

            for detw in unique_detections:
                detw.write("detections/6s/" + day + '_detections_unique', append=True)
                print detw.template_name

            cat_cc = lag_calc.lag_calc(detections=unique_detections, detect_data=st,
                                       template_names=template_files,
                                       templates=templates, shift_len=0.2,
                                       min_cc=0.4, cores=4, interpolate=False,
                                       plot=False, parallel=False)

            if not os.path.isdir("events_cc/"):
                os.makedirs("events_cc")

            for i, event in enumerate(sorted(cat_cc)):
                event.write("events_cc/" + str(i) + ".xml", format="QUAKEML")
                event.write("events_cc/" + str(i) + ".obs", format="NLLOC_OBS")

    except TypeError:
        print >>outfile, day

At the link you can find one day of data, the templates, and the script, in case the copy/paste above is not correctly formatted.
https://drive.google.com/open?id=0BzMcKedS7yadUzNGSmxqSThxNnM

Thank you for your help and time.

@calum-chamberlain
Member

calum-chamberlain commented Nov 30, 2016

I created a gist of this here:

https://gist.github.com/calum-chamberlain/f4a8fb41f51dca9f91dd8fb03205ae48

I will edit it there.

@calum-chamberlain
Member

Hey @blazvicic, sorry I have been so slow with this one - I'm hoping to have a proper look and test in the next couple of days. Just wanted you to know that I haven't forgotten!

@TraziBatine
Author

TraziBatine commented Dec 12, 2016

Sure thing, no problem! I am already getting interesting results without lag-calc, but that will add some value :)

Thanks

---edited
Hi @calum-chamberlain,
Anything new regarding the problem I had with lag-calc?
I am looking forward to it, thanks,
Blaz

@calum-chamberlain
Member

Hey Blaz,
Sorry, all sorts of things got in the way, back to this now - I can replicate the issue, but I haven't tracked it down yet.
C

@calum-chamberlain
Member

Looks like a strange issue in the depths of the correlations, it is going to take a while to work this one out.

@TraziBatine
Author

Thanks.

If there is some way I can help, I would be glad to do it... I get the same error on other days too (if I use more than one template and more than one template makes detections).

Cheers,
blaž

@calum-chamberlain
Member

Hi Blaz,

I have ruled out a few things: it's not an issue related to multiple templates being used; the issue only seems to appear when station VINO is included...

Looking more closely at the data, VINO has a significant artefact in it (there are a couple of spikes up to 10e7), which are the issue. Removing those spikes removes the issue.

So, to explain:
The correlation routine used in EQcorrscan works in the frequency domain: it takes the Fourier transform of the template and the continuous data and convolves them to get the cross-correlation. Because the spikes in the VINO data are impulsive, they contain (nearly) all frequencies, which, I think, is damaging the FFT (fast Fourier transform) routine used by OpenCV (the image-processing library we use for the correlations) and giving incorrect results for the cross-correlations.

You can see this if you plot up the correlation of the template with one channel of VINO data - the correlation values after the spikes are much higher amplitude, and appear clipped. However, if you just correlate the template with the same data period after the spikes (as done in lag_calc), the amplitudes are much lower (which is the cause of the error raised).

So...
VINO's spikes are the cause of the issue, and removing the spikes removes the issue, but I don't immediately know how to catch this case automatically. I haven't run into this before, hence the (very) slow debugging process (sorry). I will try to think of a way to catch this automatically, but in the meantime, manually checking that your data are real would be good.

@calum-chamberlain
Member

calum-chamberlain commented Jan 10, 2017

Looks like it's not a spike but a box, around 20 s long in the raw data, which ends up generating two spikes when filtered.
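
For illustration, a small synthetic sketch of that effect: band-pass filtering a 20 s box produces sharp transients at its two edges. The 200 Hz rate and 3-6 Hz band mirror the script above; the box amplitude is arbitrary:

import numpy as np
from scipy.signal import butter, sosfilt

fs = 200.0  # Hz, as in the script above
t = np.arange(int(120 * fs)) / fs
data = np.zeros_like(t)
data[int(40 * fs):int(60 * fs)] = 1e7  # a 20 s box in otherwise flat data

# The same 3-6 Hz band-pass shape as the processing in the script
sos = butter(3, [3.0 / (fs / 2), 6.0 / (fs / 2)], btype='band', output='sos')
filtered = sosfilt(sos, data)

# The filter removes the flat top of the box but rings at its edges,
# leaving two large transients near t = 40 s and t = 60 s:
big = t[np.abs(filtered) > 0.1 * np.abs(filtered).max()]
print big.min(), big.max()  # roughly 40 and 60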

@TraziBatine
Author

Thanks for your time with that and the comments about the problem.
Anyway, I tried my script on another day for which the data were OK, and the results are what they should be.
I tried relocating the detected events, and the locations are consistent with manual P and S picking and locating.

The "near real time" works on my PC now :)

@calum-chamberlain
Member

Near real-time? I would be curious to hear more about what you are up to!

@TraziBatine
Author

TraziBatine commented Jan 11, 2017

I am sorry, nothing fancy - that is why the quotation marks...
It's a crontab job every hour for 2 small arrays.

Although we did discuss at the department that this would indeed be a good thing... it did not go further than this. If we go into it, I will discuss it with you, since I saw your plan is the same.

Does obspy have a module for continuous streams?
edited:
Ah, yes, there is the realtime module.

@calum-chamberlain
Member

Cool, I have never got round to doing that, but always wanted to!
