Variable string encoding for SAC files #1773

Merged
merged 1 commit into from May 12, 2017

Conversation

lijunzh
Contributor

@lijunzh lijunzh commented May 2, 2017

This is an attempt to add an "encoding" flag to the read() function in obspy.io.sac.sactrace in order to read SAC files whose encoding is something other than "ASCII".

PR Checklist

  • All tests still pass.
  • Any new features or fixed regressions are covered via new tests.
  • Any new or changed features are fully documented.
  • Significant changes have been added to CHANGELOG.txt .
  • First time contributors have added your name to CONTRIBUTORS.txt .

Closes #1768.

@@ -171,7 +171,7 @@ def _clean_str(value, strip_whitespace=True):
     Trace that the user may have manually added.
     """
     try:
-        value = value.decode('ASCII', 'replace')
+        value = value.decode(encode, 'replace')
Member

I think it would be a good idea to make this something like..

try:
    value = value.decode(encoding, 'strict')
except UnicodeError:
    msg = ('Encountered characters that can not be '
           'decoded with encoding {}: {}\nThese characters '
           'are ignored.'.format(encoding, value))
    warnings.warn(msg)
    value = value.decode(encoding, 'replace')
except AttributeError:
    ...
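
A self-contained sketch of the fallback suggested above. The helper name `decode_header` is hypothetical (it is not an ObsPy function), and the warning text is adjusted to say "replaced" because the `'replace'` error handler substitutes U+FFFD rather than dropping bytes:

```python
import warnings


def decode_header(value, encoding='ASCII'):
    """Decode a header byte string; hypothetical helper for illustration."""
    try:
        return value.decode(encoding, 'strict')
    except UnicodeError:
        msg = ('Encountered characters that can not be decoded with '
               'encoding {}: {!r}\nThese characters are replaced.'
               .format(encoding, value))
        warnings.warn(msg)
        # Undecodable bytes become U+FFFD, the Unicode replacement character.
        return value.decode(encoding, 'replace')
    except AttributeError:
        # Already a text string (str has no .decode on Python 3).
        return value
```

Clean ASCII input passes through silently; only input that fails strict decoding triggers the warning and the lossy path.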

 if val.startswith(native_str('-12345')):
     val = HD.SNULL
-hs[i] = val
+hs[i] = val.encode('ascii', 'replace')
Member

I can't follow this logic, why always encode to ASCII when the user might have specified a different encoding?

@megies
Member

megies commented May 4, 2017

So right now you're only handling reading files with a different encoding? I assume the idea is to regard files with an encoding different than ASCII as illegal files and only facilitate reading them, while any output will be encoded to ASCII? I can follow that logic, but then I think there should be a warning message when a user specifies a different encoding but the output is ASCII encoded.

@lijunzh
Contributor Author

lijunzh commented May 4, 2017

Thanks, @megies, for reviewing the pull request. I will answer your two review comments and your last comment together here.

  1. As @megies mentioned previously

the main point is whether SAC is allowing an encoding different from ASCII. I only see it mentioned as "alphanumeric" in the SAC manual, so my guess is.. no.

I think that we should always reinforce the alphanumeric idea by using only ASCII encoding. This also helps with the encoding problem between py2 and py3 (e.g. you will get an error when decoding special characters encoded in UTF-8 using ASCII, but not the other way around), since

A UTF-8 file that contains only ASCII characters is identical to an ASCII file.

from Wikipedia

  1. Yes, I am kind of regarding any file encoded in something other than ASCII as an illegal file. But by supporting the 'encode' flag, we give users the ability to read the content. However, I don't think station/channel names with non-ASCII characters are very useful (except when they are written in a different language, e.g. Chinese, Arabic, etc.). So, I am in favor of replacing those characters with '?' instead of passing them along to the stream object. More importantly, I don't want to let users write a SAC file that contains such characters, which may cause problems when they process it in a different program or pass it to others. I would like Obspy to write SAC files only in ASCII encoding.

  2. If you guys think that we should support "UTF-8" encoding throughout obspy.io.sac, I would say that the solution in this PR is a cheap band-aid that we should avoid. However, that would involve changing a lot of code in many places, which is out of the scope of this PR. We can start a different PR to address that problem instead.

  3. The idea of the warning message is great. My initial thought was that when users see a bunch of '????' in their stream object, they will know something went wrong with the encoding. But raising an explicit warning message is a better way to go. I will change the PR accordingly.
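
The ASCII/UTF-8 asymmetry mentioned in the first point can be checked with a few lines of plain Python. This is only an illustration; the station names are made up and nothing here touches ObsPy:

```python
ascii_only = 'KONO'
# ASCII-only text encodes to identical bytes under ASCII and UTF-8.
same_bytes = ascii_only.encode('ascii') == ascii_only.encode('utf-8')

# ASCII bytes always decode as UTF-8 ...
decoded = ascii_only.encode('ascii').decode('utf-8')

# ... but UTF-8 bytes holding a non-ASCII character do not decode as ASCII.
utf8_bytes = 'Z\u00fcrich'.encode('utf-8')
try:
    utf8_bytes.decode('ascii')
    ascii_decodable = True
except UnicodeDecodeError:
    ascii_decodable = False

print(same_bytes, decoded, ascii_decodable)
```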

@megies
Member

megies commented May 4, 2017

I tend to agree with the general idea outlined above.

Default:

  • read and decode textual headers as ASCII, show useful error message explaining the option to specify a different encoding but that the file should be considered illegal (show that message when a UnicodeError would get raised)
  • write and encode as ASCII

Specifying an encoding:

  • use that encoding to read/decode textual headers
  • write and encode as ASCII and show warning message explaining the situation
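
One way to read the policy above as code. This is a rough sketch under stated assumptions: `read_text_header` and `write_text_header` are hypothetical names, and raising `ValueError` is only one possible way to surface the "useful error message"; none of this is the ObsPy implementation:

```python
import warnings


def read_text_header(raw, encoding='ASCII'):
    """Decode raw header bytes; on failure, point the user to `encoding`."""
    try:
        return raw.decode(encoding)
    except UnicodeError:
        raise ValueError(
            'Header {!r} is not valid {}; the file should be considered '
            'illegal, but a different encoding can be specified to read '
            'it.'.format(raw, encoding))


def write_text_header(text, encoding='ASCII'):
    """Always encode as ASCII; warn if another encoding was requested."""
    if encoding.upper() != 'ASCII':
        warnings.warn('Encoding {} was specified, but output is always '
                      'ASCII encoded.'.format(encoding))
    # Characters outside ASCII are written as '?'.
    return text.encode('ASCII', 'replace')
```

Under this policy a read-then-write cycle is lossy for non-ASCII headers: bytes read as CP 1252 text come back out with `?` in place of each non-ASCII character.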

@lijunzh
Contributor Author

lijunzh commented May 4, 2017

By the way, the current testing file test_encode.sac only has special characters in its textual headers. It would be nice to have a SAC file that has only alphanumeric characters but is encoded in an encoding other than ASCII. This would allow us to test the encoding conversion, though it shouldn't be a problem as far as I can tell from reading the code.

@jkmacc-LANL
Contributor

To reiterate what I said in the issue, changing a string header from unknown to '?????' by default is different behavior from the way the SAC program behaves (which seems to allow arbitrary byte strings to pass through unchanged). This may trip up some users, and I suspect will result in bug reports. Also, I think our current test suite has some assumptions that a file can be read and re-written unchanged.

@megies
Member

megies commented May 4, 2017

Right now byte strings can also not pass through unchanged.. and we never claim to be 100% sac compatible, so I don't see that as a show stopper. Showing meaningful warnings in these exotic and probably sac-format-breaching cases is enough I would say..

@megies
Member

megies commented May 4, 2017

I think our current test suite has some assumptions that a file can be read and re-written unchanged

I don't think so, as it looks like you're calling this string cleanup method that will always decode..

@lijunzh
Contributor Author

lijunzh commented May 4, 2017

the way the SAC program behaves (which seems to allow arbitrary byte strings to pass through unchanged)

If we want to achieve that, we may need a class attribute that stores the original headers in bytes. An alternative is to record the user-supplied encoding flag and use it when writing to SAC. Either way involves changing the code structure, which I am not comfortable doing as a newcomer to Obspy.

our current test suite has some assumptions that a file can be read and re-written unchanged.

I doubt we have a test suite that checks the integrity of a SAC file (unchanged after a read-write cycle). My tweaks to the code clearly do not hold to that principle, yet they pass the tests.

@lijunzh
Contributor Author

lijunzh commented May 4, 2017

Can anyone give me some advice here? Travis CI reports some test failures like:

FAIL: test_catalog_plot_ortho (obspy.core.tests.test_event.CatalogBasemapTestCase)

which I didn't even mess with. All changes I made were in obspy.io.sac folder. I am not sure what went wrong here.

@jkmacc-LANL and @megies, maybe keeping ASCII encoding as the default broke some pre-existing assumption?

@jkmacc-LANL
Contributor

@lijunzh That failure may be unrelated. If you see it when you run the test suite on the master branch, then it's not from you. I think you can look for that failure on other PRs, too, to see if it's from your changes.

@megies
Member

megies commented May 5, 2017

My guess is that these image test fails come from slight changes in new matplotlib 2.0.1 which was finally released a few days ago. Baseline images were created some weeks ago on matplotlib 2.0.x branch..

CC @krischer @QuLogic

@megies
Member

megies commented May 5, 2017

@lijunzh but there's also some "real" fails in io.sac: http://tests.obspy.org/79829/#1

@megies
Member

megies commented May 5, 2017

In any case, I'm leaving this to @jkmacc-LANL to decide how to tackle this encoding issue in io.sac.

@megies megies requested a review from jkmacc-LANL May 5, 2017 08:46
@megies megies added the .io.sac label May 5, 2017
@megies megies added this to the 1.1.0 milestone May 5, 2017
@krischer
Member

krischer commented May 5, 2017

My guess is that these image test fails come from slight changes in new matplotlib 2.0.1 which was finally released a few days ago. Baseline images were created some weeks ago on matplotlib 2.0.x branch..

CC @krischer @QuLogic

Seems to be that the clipping of the edges is slightly different - we can just up the tolerance a tiny bit - no need to recreate the test images IMHO. I'll do it in a separate PR.

@@ -1100,7 +1100,7 @@ def reftime(self, new_reftime):
     # --------------------------- I/O METHODS ---------------------------------
     @classmethod
     def read(cls, source, headonly=False, ascii=False, byteorder=None,
-             checksize=False, debug_strings=False):
+             checksize=False, debug_strings=False, **kwargs):
Member

Instead of using kwargs - can you please directly specify the argument name in the function definition? This is IMHO much easier to read and understand.

Contributor Author

That's actually easier for me to code. I was trying to keep all the code unchanged until absolutely needed. If you guys have no problem with adding a new parameter to read in obspy.io.sac.sactrace and change the calling statement in _internal_read_sac in obspy.io.sac.core, I will be more than happy to do that.

Member

Great :) Public methods (not starting with underscore) should be backwards compatible but just adding a new argument that defaults to the old behavior of course satisfies that constraint.
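
To illustrate the backwards-compatibility point: a new keyword with a default that reproduces the old behavior, appended at the end of the signature, leaves existing calls untouched. `read_header` below is a hypothetical stand-in, not the signature of `SACTrace.read`:

```python
def read_header(raw, strip_whitespace=True, encoding='ASCII'):
    """Hypothetical reader; `encoding` is new and defaults to old behavior."""
    text = raw.decode(encoding, 'replace')
    return text.strip() if strip_whitespace else text


# A positional call written before the change still works unchanged ...
old_style = read_header(b'KONO    ', False)
# ... and new callers opt in explicitly by keyword.
new_style = read_header(b'caf\xe9 ', encoding='cp1252')
```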

Contributor Author

Nice. I will change the code accordingly.

@krischer krischer changed the title Fixes #1768 Variable string encoding for SAC files May 5, 2017
@lijunzh
Contributor Author

lijunzh commented May 5, 2017

Can anyone give some insight what this test is for?

    def test_reftime_incomplete(self):
        """
        Replacement for SACTrace._from_arrays doctest which raises UserWarning
        """
        sac = SACTrace._from_arrays()
        self.assertTrue(sac.lcalda)
        self.assertFalse(sac.leven)
        self.assertFalse(sac.lovrok)
        self.assertFalse(sac.lpspol)
        self.assertEqual(sac.iztype, None)
        self.assertRaises(SacHeaderTimeError, getattr, sac, 'reftime')
        # raises "UserWarning: Reference time information incomplete"
        with warnings.catch_warnings(record=True) as w:
            warnings.simplefilter('always', UserWarning)
            str(sac)
            self.assertEqual(len(w), 1)
            self.assertIn("Reference time information incomplete", str(w[0]))

My code fails at

self.assertEqual(len(w), 1)

and I don't know why we want to keep len(w) equal to 1.

@krischer
Member

krischer commented May 5, 2017

It is probably useful to either jump to the line with a debugger or just print the caught warnings. Maybe some other warning got triggered as well? Tests usually assert the number of caught warnings as they intentionally trigger only one warning.

@lijunzh
Contributor Author

lijunzh commented May 5, 2017

I kind of throw multiple warnings for different content when using the "encoding" flag. The first warns users that even if they specify an encoding, ASCII will be used for output. The second tells users which characters are replaced by "?" when the conversion happens. If that triggers this test failure, I can merge them.

@jkmacc-LANL
Contributor

jkmacc-LANL commented May 5, 2017

This test is to verify that the user gets exceptions when they explicitly request time header information that isn't there, and to warn them when they implicitly request it (str(sac)). I only expect one warning, but since you're now expecting more than one, I think it's fair to change this test.

EDIT: Actually, I guess I don't see why this test is failing. SACTrace._from_arrays() should initialize a very valid blank header.

@lijunzh
Contributor Author

lijunzh commented May 5, 2017

In fact, when I tested on my machine, I don't see this fail. It correctly delivers only one warning:

'Reference Time = XX/XX/XX (XXX) XX:XX:XX.XXXXXX\n\tiztype not set\nlcalda     = True\nleven      = False\nlovrok     = False\nlpspol     = False'
/Users/lijun/anaconda3/lib/python3.6/site-packages/obspy/io/sac/sactrace.py:1454: UserWarning: Reference time information incomplete.
  warnings.warn(msg)

More importantly, Travis CI shows that it only fails on py27 and py33. I then created an env for py27 and repeated the test. Here are my results:

(py27) lijun@Lijuns-MacBook-Pro:~$ python
Python 2.7.13 |Continuum Analytics, Inc.| (default, Dec 20 2016, 23:05:08)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> import obspy
>>> sac = obspy.io.sac.sactrace.SACTrace._from_arrays()
>>> sac
SACTrace(lcalda=1, leven=0, lovrok=0, lpspol=0)
>>> str(sac)
/Users/lijun/anaconda3/envs/py27/lib/python2.7/site-packages/obspy/io/sac/sactrace.py:1441: UserWarning: Reference time information incomplete.
  warnings.warn(msg)
'Reference Time = XX/XX/XX (XXX) XX:XX:XX.XXXXXX\n\tiztype not set\nlcalda     = True\nleven      = False\nlovrok     = False\nlpspol     = False'
>>> import warnings
>>> with warnings.catch_warnings(record=True) as w:
...     warnings.simplefilter('always', UserWarning)
...
>>> len(w)
0
>>> type(w)
<type 'list'>
>>> w
[]

Apparently, py27 treats w as an empty list. This leads me to think that maybe the behavior of warnings.catch_warnings changed after py33. But this is not caused by the changes made in this PR.

@jkmacc-LANL Maybe we should fix this problem in a different PR. You may have more insight in this test.
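
Note that in the interactive session above, `str(sac)` runs before the `with` block and nothing is called inside it, so `w` would be empty on any Python version. A minimal sketch of how `warnings.catch_warnings(record=True)` records a warning, using a hypothetical `describe()` function in place of `str(sac)`:

```python
import warnings


def describe():
    """Hypothetical stand-in for str(sac): warns, then returns text."""
    warnings.warn('Reference time information incomplete.')
    return 'Reference Time = XX/XX/XX'


with warnings.catch_warnings(record=True) as w:
    warnings.simplefilter('always', UserWarning)
    describe()  # the call must happen inside the block to be recorded

with warnings.catch_warnings(record=True) as empty:
    warnings.simplefilter('always', UserWarning)
    # Nothing is called here, so nothing is recorded.

print(len(w), len(empty))
```

This does not by itself explain the py27/py33 failures on Travis CI; it only shows that the experiment in the session needs the triggering call inside the block before conclusions can be drawn.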

Contributor

@jkmacc-LANL jkmacc-LANL left a comment

This is a pretty darn good PR; thanks! I have just a few relatively minor suggestions.

@@ -381,7 +381,12 @@ def _internal_read_sac(buf, headonly=False, debug_headers=False, fsize=True,
     :return: A ObsPy Stream object.
     """
     # read SAC file
-    sac = SACTrace.read(buf, headonly=headonly, ascii=False, checksize=fsize)
+    if 'encoding' in kwargs.keys():
Contributor

Instead of repeating the SACTrace.read call in branches, maybe you can just do a single SACTrace.read(..., kwargs.get('encoding', 'ASCII')).

Contributor Author

I didn't know that we can do this. It will be much cleaner.
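
For reference, the `dict.get` pattern with a default collapses the two branches into one call. Both functions below are hypothetical stand-ins for `SACTrace.read` and `_internal_read_sac`, reduced to the encoding argument:

```python
def read(buf, encoding='ASCII'):
    """Hypothetical stand-in for SACTrace.read."""
    return buf.decode(encoding, 'replace')


def internal_read(buf, **kwargs):
    # One call covers both cases: use the caller's encoding if given,
    # otherwise fall back to ASCII.
    return read(buf, encoding=kwargs.get('encoding', 'ASCII'))
```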

@@ -162,7 +162,7 @@ def is_same_byteorder(bo1, bo2):
     return (bo1.lower() in le) == (bo2.lower() in le)


-def _clean_str(value, strip_whitespace=True):
+def _clean_str(value, encoding='ASCII', strip_whitespace=True):
Contributor

Please move the encoding keyword to the end, to preserve the original ordering of the strip_whitespace argument.

@@ -381,7 +381,12 @@ def _internal_read_sac(buf, headonly=False, debug_headers=False, fsize=True,
:return: A ObsPy Stream object.
Contributor

The encoding keyword may also need to be documented here? Maybe overkill, but the others are too.

Contributor Author

I have second thoughts here. I don't think we want to make the "encoding" flag a standard parameter, nor do we want to encourage users to use it. It exists because bad SAC files are created when people pass files from one system to another. When users face an encoding problem like mine (i.e. someone may have read the file on a Windows system and didn't encode it correctly before writing it to SAC, which is why I have a lot of characters in CP 1252), we have a backup plan: a user warning guides them to the "encoding" flag. I did document the "encoding" parameter in obspy.io.sac.sactrace, which I regard as the lowest level, where we can hide its existence.

 :param encoding: By default, ASCII encoding is used. ASCII-characters
            encoded in a different encoding scheme will be converted to ASCII
            while other special characters will be replaced by '?'.
 :type encoding: str

If you guys think it is better to have it as a standard parameter, I have no problem with it. But it seems like an unnecessary change to the documentation and Obspy functionality that most people will neither use nor want to know about.

Contributor

Good thought. I'm OK with that.

Contributor Author

That's very nice of you. I will go ahead and change the rest of the PR and make a push now.

"All non-ASCII characters will be replaced by "
"?".format(encoding))
warnings.warn(msg)
val = _ut._clean_str(val, encoding=encoding,
Contributor

Swap position of encoding and strip_whitespace arguments.

Contributor Author

Good catch. The parameter positions need to be consistent.

Contributor

Oops, the order doesn't technically matter here.

@@ -1231,7 +1248,7 @@ def _from_arrays(cls, hf=None, hi=None, hs=None, data=None):
         .. rubric:: Example

         >>> sac = SACTrace._from_arrays()
-        >>> print(sac) # doctest: +NORMALIZE_WHITESPACE +SKIP
+        >>> print(sac) # doctest: +NORMALIZE_WHITESPACE
Contributor

If this test is giving the PR problems, and it was previously skipped, maybe just put the +SKIP back in and I'll make a new PR for it.

@megies
Member

megies commented May 11, 2017

Looking at the reflog you posted, this is the state before the rebase that went wrong, I think:
https://github.com/obspy/obspy/commits/637696bec

Personally, I would recommend to just raze your branch, start again from current master and manually apply the diff of this page (i.e. the final state):
master...637696b

@megies Attached is the output of that command. I did rebase multiple times so that log is messed with all kinds of "rebase" history.

yeah.. but it's really pretty simple.. just search for lines "checkout" because before rebasing at some point you needed to switch to that branch.. :-)

@lijunzh
Contributor Author

lijunzh commented May 11, 2017

@megies I did try to erase the branch and start all over again. But that automatically closed my PR, and I asked how to reopen it before. I was only able to reopen my PR once I restored my old branch, and then the rebase problem came back.

@megies
Member

megies commented May 11, 2017

@megies I did try to erase the branch and start all over again. But that automatically closed my PR, and I asked how to reopen it before. I was only able to reopen my PR once I restored my old branch, and then the rebase problem came back.

just do..

git checkout master
git pull
git branch -D sac_string_encoding_flag
git checkout -b sac_string_encoding_flag
# do changes and commit them
git push -f sac_string_encoding_flag

@lijunzh lijunzh closed this May 11, 2017
@lijunzh
Contributor Author

lijunzh commented May 11, 2017

@megies I did what you said here:

git checkout master
git pull
git branch -D sac_string_encoding_flag
git checkout -b sac_string_encoding_flag
# do changes and commit them
git push -f sac_string_encoding_flag

And it says

Pull request successfully merged and closed
You’re all set—the lijunzh:sac_string_encoding_flag branch can be safely deleted.

However, I can't see my changes on master branch. Could you please let me know what happened?


EDIT:

I realized that I didn't commit the changes which only sit in my stash. I added those changes back and pushed a new commit:

Fix string encoding problem of reading SAC files. 865fd6b

Somehow, the Travis CI and CircleCI tests were canceled. But the code changes should be fine. They have passed those tests multiple times.

@megies
Member

megies commented May 11, 2017

I'm on the move now.. maybe somebody on our gitter channel can give further advice, otherwise I'll do that change tomorrow.

https://gitter.im/obspy/obspy

@lijunzh lijunzh reopened this May 11, 2017
@lijunzh
Contributor Author

lijunzh commented May 11, 2017

@megies @jkmacc-LANL I think this PR is now ready to merge. It contains only one commit, which captures all the changes I made for the minimal-change version.

@@ -1092,7 +1092,7 @@ def read(cls, source, headonly=False, ascii=False, byteorder=None,
 val = _ut._clean_str(val, strip_whitespace=False)
 if val.startswith(native_str('-12345')):
     val = HD.SNULL
-hs[i] = val
+hs[i] = val.encode('ASCII', 'replace')
Member

This doesn't make sense to me; when reading, why would you want bytes?

Contributor Author

Thanks for your comment. hs[i] is an element of an array of bytes, so the previous statement made an implicit conversion equivalent to hs[i] = val.encode('ASCII', 'strict'). However, when the header string is not encoded in 'ASCII', this causes a problem and raises an exception. This PR tries to let non-ASCII characters pass through the read function as '?' for now, and leaves the implementation of an 'encoding' flag for a later version, as we discussed before.
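
The difference between the two error handlers described here can be seen with plain `str.encode`, independent of ObsPy or NumPy. The header value is made up for illustration:

```python
val = 'STA\xe9'  # a decoded header string containing a non-ASCII character

# 'strict' refuses characters outside ASCII ...
try:
    val.encode('ASCII', 'strict')
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# ... while 'replace' writes '?' in their place.
replaced = val.encode('ASCII', 'replace')

print(strict_ok, replaced)
```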

Member

_ut._clean_str produces a str; encode goes from str to bytes; I think you might have something backwards. Are you testing on Python 3?

Member

OK, upon further investigation, this works because SAC I/O is not very good at defining boundaries and does encode/decode a bit too much.

Member

The other thing I didn't know is that NumPy does an implicit encode here when assigning to hs[i] (even on Python 3, probably for backwards compatibility reasons, but it's kind of unfortunate.)

Contributor Author

Totally agree with you. This has given us a lot of problems in the past.

Member

@QuLogic Yep.. it's a bit strange that we store it internally as bytes again (in numpy) after first decoding it. But we kind of decided to make this PR a minimal fix and postpone the more major cleanup for later..

@QuLogic
Member

QuLogic commented May 11, 2017

The latest Travis build is here but for some reason it's not linking it right.

@QuLogic
Member

QuLogic commented May 11, 2017

So even GitHub is confused here: git fetch upstream pull/1773/head && git log -1 FETCH_HEAD:
commit 85167d0 (that's not the latest commit here!)

Maybe a reopen will fix it...

@QuLogic QuLogic closed this May 11, 2017
@QuLogic QuLogic reopened this May 11, 2017
@QuLogic
Member

QuLogic commented May 11, 2017

Bah, still wrong, but at least the merge commit (which Travis and AppVeyor test) is correct:

$ git fetch upstream pull/1773/head && git log -1 FETCH_HEAD
From github.com:obspy/obspy
 * branch            refs/pull/1773/head -> FETCH_HEAD
commit 85167d07adc14c823b9a951eb60eeea70ddb98c2
Merge: a203b5a 46ca3c5
Author: Tobias Megies <megies@users.noreply.github.com>
Date:   Thu May 11 14:32:27 2017 +0200

    Merge pull request #1667 from obspy/docker_add_debian_stretch
    
    Debian stretch / Ubuntu Zesty 17.04

$ git fetch upstream pull/1773/merge && git log -1 FETCH_HEAD
From github.com:obspy/obspy
 * branch            refs/pull/1773/merge -> FETCH_HEAD
commit bffbcac2d63eaf981308fef95eedacddc97e976a
Merge: 85167d0 865fd6b
Author: Lijun Zhu <lijunzh@users.noreply.github.com>
Date:   Thu May 11 22:56:03 2017 +0000

    Merge 865fd6b0588bae8c093beab6a32f9a61e236df77 into 85167d07adc14c823b9a951eb60eeea70ddb98c2

@QuLogic
Member

QuLogic commented May 12, 2017

I'm still not sure the existing architecture of bytes/str makes sense, but this looks fine for now.

@QuLogic
Member

QuLogic commented May 12, 2017

PS, both docker bots and CircleCI are running on the PR head, whereas Travis and AppVeyor are using the PR merge head. I'm not sure how GitHub got out of sync like this, but that's why they're failing.

@lijunzh
Contributor Author

lijunzh commented May 12, 2017

Thanks, @QuLogic, for trying to clear up the git problem here. Hopefully, we can merge despite those red crosses. The code itself is actually good. It passed those tests before I tried to rebase the git changes.

@jkmacc-LANL
Contributor

So, if I "dismiss" @krischer's review, will that make merging possible? Is everything else good to go?

@megies
Member

megies commented May 12, 2017

Rest of the failing CI is unrelated, so I'm gonna merge this now. Thanks @lijunzh for working on this and uncovering what the actual problem was/is exactly!

@megies megies merged commit ab82a58 into obspy:master May 12, 2017
@krischer
Member

So, if I "dismiss" @krischer's review, will that make merging possible? Is everything else good to go?

You can just press "merge" and then use your admin privileges to merge it even if some reviews are not yet accepted.

Sorry for being a bit slow this week.

@lijunzh lijunzh deleted the sac_string_encoding_flag branch May 12, 2017 12:09
@lijunzh
Contributor Author

lijunzh commented May 12, 2017

@megies @jkmacc-LANL @krischer @QuLogic Thank you all for helping me through this PR. It was a delightful experience contributing to Obspy community.

@jkmacc-LANL Do you want me to start another PR for actually adding an 'encoding' flag to the SAC read/write functions, or do you want to do it yourself?

@megies
Member

megies commented May 12, 2017

You can just press "merge" and then use your admin privileges to merge it even if some reviews are not yet accepted.

I can't confirm (obviously), but I think this override switch is only available for members of the "admins" team (which is rather small) but not for the "developers" team (which is huge).. which makes sense kind of..

see https://help.github.com/articles/repository-permission-levels-for-an-organization/

@jkmacc-LANL
Contributor

jkmacc-LANL commented May 12, 2017

@megies Correct, "merge" wasn't highlighted, so I think I needed the help. Thanks for managing that.
@lijunzh Sorry this ultimately simple change became rather complicated, but the hard part is over :-) Welcome to the contributors! Thanks for your efforts.

EDIT: I'll do a new PR for adding the write encoding. I think you and @QuLogic are right, storing string headers in a NumPy array that recasts to bytestrings all the time spreads the encoding/decoding logic around too much. I have some ideas about that, and I can follow up. Thanks!

@lijunzh
Contributor Author

lijunzh commented May 13, 2017

@jkmacc-LANL Since we only access the header field by field, why not store it in a list during the read/write process, which can easily hold bytes or strings? That way, we only need to encode/decode once during the read/write process.

I'll do a new PR for adding the write encoding.

Regarding the 'encoding' flag, I don't think we now have it for either read or write. This PR was only a minimal fix for the current unexpected exception. I can work with you to set a uniform read/write 'encoding' flag.

@megies
Member

megies commented May 15, 2017

@lijunzh Sorry this ultimately simple change became rather complicated, but the hard part is over :-) Welcome to the contributors! Thanks for your efforts.

I can second that, thanks for your work and.. welcome! :-)

Labels
.io.sac ready for review PRs that are ready to be reviewed to get marked ready to merge
Development

Successfully merging this pull request may close these issues.

SAC reading problem
5 participants