Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

u7z fails to parse a particular archive #2707

Closed
mc-butler opened this issue Dec 27, 2011 · 33 comments
Closed

u7z fails to parse a particular archive #2707

mc-butler opened this issue Dec 27, 2011 · 33 comments
Assignees
Labels
area: vfs Virtual File System support prio: medium Has the potential to affect progress
Milestone

Comments

@mc-butler
Copy link

Important

This issue was migrated from Trac:

Origin https://midnight-commander.org/ticket/2707
Reporter birdie (aros@….com)

Subject.

Note

Original attachments:

@mc-butler
Copy link
Author

Changed by birdie (aros@….com) on Dec 27, 2011 at 13:26 UTC

7z l of this archive

@mc-butler
Copy link
Author

Changed by andrew_b (@aborodin) on Dec 27, 2011 at 15:47 UTC (comment 1)

  • Component changed from mc-core to mc-vfs

@mc-butler
Copy link
Author

Changed by and on Mar 5, 2016 at 19:08 UTC (comment 2)

Confirmed,
7z archive can hold entries without datetime info
which hide/skip u7z helper.

Patch attached.

@mc-butler
Copy link
Author

Changed by and on Mar 5, 2016 at 19:10 UTC

@mc-butler
Copy link
Author

Changed by zaytsev (@zyv) on Mar 5, 2016 at 21:13 UTC (comment 3)

@and, in the future, please submit whitespace only patches as separate commits. If you change the whitespace in the same commit, then it's not immediately clear from the diff whether the additional changes also concern the logic, or it's only whitespace, unless special options are used (ignore whitespace or word diff). This makes reviewing the history much more tedious than otherwise.

@mc-butler
Copy link
Author

Changed by zaytsev (@zyv) on Mar 5, 2016 at 21:19 UTC (comment 4)

  • Milestone changed from Future Releases to 4.8.17

I have just created a 4.8.17 milestone. I suggest to use this for all tickets with fresh patches to be reviewed.

We need to make a 4.8.16 at some point this month, so just assigning all newly processed tickets to 4.8.16 is not helpful, as we need to stop at some point. But if they are left for future releases, then one would need to go through them again and see which ones already have good patches on review. Assigning them to 4.8.17 sounds like a good solution to me.

@mc-butler
Copy link
Author

Changed by zaytsev (@zyv) on Apr 30, 2016 at 16:08 UTC (comment 5)

  • Milestone changed from 4.8.17 to 4.8.18

@mc-butler
Copy link
Author

Changed by zaytsev (@zyv) on Sep 22, 2016 at 19:37 UTC (comment 6)

  • Milestone changed from 4.8.18 to 4.8.19

@mc-butler
Copy link
Author

Changed by zaytsev-work (@zyv) on Oct 28, 2016 at 13:10 UTC (comment 7)

  • Status changed from new to accepted
  • Owner set to zaytsev-work

@mc-butler
Copy link
Author

Changed by zaytsev (@zyv) on Dec 18, 2016 at 13:37 UTC (comment 8)

  • Branch state changed from no branch to on review

Branch: 2707_extfs_u7z
Initial changeset: [7f6a700d9e2feea02cde4097d64a7ba522eea7b9]

@mc-butler
Copy link
Author

Changed by zaytsev (@zyv) on Dec 18, 2016 at 13:39 UTC (comment 9)

See also #3730 regarding making a possible test out of this.

@mc-butler
Copy link
Author

Changed by mooffie (@mooffie) on Dec 18, 2016 at 20:13 UTC (comment 10)

Please hold on the merge for one day, ok? I'll have a quick look into the testability of this today.

@mc-butler
Copy link
Author

Changed by zaytsev (@zyv) on Dec 18, 2016 at 20:37 UTC (comment 11)

No worries, it has been waiting for 5 years already, so I have absolutely no problems with waiting a couple of days more.

@mc-butler
Copy link
Author

Changed by mooffie (@mooffie) on Dec 19, 2016 at 1:20 UTC (comment 12)

There's another problem in the archive @birdie posted: the size field is missing for some files.

  • I'm attaching a patch to fix this.
  • The first patch, though, will be to make the code more readable (otherwise you won't understand my fix).
  • Then two other small patches will follow (portability, cleanup).

Don't worry: you won't have to wait 5 more years to commit this. I've already prepared tests (I'll start a new ticket shortly) so you can commit this outright.

@mc-butler
Copy link
Author

Changed by mooffie (@mooffie) on Dec 19, 2016 at 1:21 UTC

@mc-butler
Copy link
Author

Changed by mooffie (@mooffie) on Dec 19, 2016 at 1:22 UTC

@mc-butler
Copy link
Author

Changed by mooffie (@mooffie) on Dec 19, 2016 at 1:30 UTC

@mc-butler
Copy link
Author

Changed by zaytsev-work (@zyv) on Dec 19, 2016 at 8:22 UTC (comment 14)

Could you please give me an example of sed that has problems with \s ? Did you mean macOS?

zaytsevy:~ zaytsev$ echo "abc   def" | sed -e 's/\s*//g'
abc   def
zaytsevy:~ zaytsev$ echo "abc   def" | sed -e 's/[[:space:]]*//g' 
abcdef

I'm not sure that replacing \s with is the best approach, because \s at very least also includes tab, but maybe in this case it doesn't matter and if the input has some other whitespace we have a problem anyways.

@mc-butler
Copy link
Author

Changed by birdie (aros@….com) on Dec 19, 2016 at 9:42 UTC (comment 15)

I can imagine 7z archives which lack both compressed _and_ uncompressed sizes simultaneously. Please consider this.

@mc-butler
Copy link
Author

Changed by mooffie (@mooffie) on Dec 19, 2016 at 15:05 UTC (comment 14.16)

Replying to zaytsev-work:

I'm not sure that replacing \s with ' ' is the best approach, because \s at very least also includes tab [...] and if the input has some other whitespace we [...]


I don't see a problem with replacing \s with just ' '. On the contrary. If the input has whitespace zoo, our whole sed recipe would fail: we're very precise in the rest of the pattern(s).

Could you please give me an example of sed that has problems with \s ?


I'm not familiar with such sed personally. \s is just not standard.

Now, speaking of portability:

There was something that bothered me and I now verified it: the | (bar) I used in \($date_re\|$empty_date_re\) isn't portable either. sed is supposed to support "basic regular expressions" and | is of "extended regular expressions". So it's my mistake.

I'll soon re-upload patch #2 (after I get more info from @birdie).

@mc-butler
Copy link
Author

Changed by mooffie (@mooffie) on Dec 19, 2016 at 15:10 UTC (comment 15.17)

Replying to birdie:

I can imagine 7z archives which lack both compressed _and_ uncompressed sizes simultaneously. Please consider this.


Could you please provide more info? I.e., why would they be missing?

(I'm asking this because sed is a bit limited, as you see, and I don't want to bloat the recipe needlessly.)

@mc-butler
Copy link
Author

Changed by birdie (aros@….com) on Dec 19, 2016 at 16:44 UTC

@mc-butler
Copy link
Author

Changed by birdie (aros@….com) on Dec 19, 2016 at 16:44 UTC (comment 18)

Replying to mooffie:

Could you please provide more info? I.e., why would they be missing?

See listing2.txt - its last file has no sizes at all.

@mc-butler
Copy link
Author

Changed by mooffie (@mooffie) on Dec 20, 2016 at 0:28 UTC

@mc-butler
Copy link
Author

Changed by mooffie (@mooffie) on Dec 20, 2016 at 0:29 UTC

@mc-butler
Copy link
Author

Changed by mooffie (@mooffie) on Dec 20, 2016 at 1:04 UTC (comment 19)

Ok, I updated patch 0002 (so I had to update 0003 too). Remember to commit them in order: 0001, 0002, 0003, 0004.

What's new in patch 0002?

  • It no longer uses | in the regex.
  • It handles the new case @birdie noted where both the uncompressed and compressed sizes are missing (see 'listing2.txt').

Don't worry about tricky bugs: the tests at #3744 were updated. 'u7z.missing-size-and-date.input' tests all the possibilities that patch 0002 needs to handle.

(As for counting 19 spaces in patch 0003: I felt ' *' would be too loose. Whatever. Let's not be fussy about this.)

@mc-butler
Copy link
Author

Changed by zaytsev (@zyv) on Dec 23, 2016 at 18:52 UTC (comment 20)

See listing2.txt - its last file has no sizes at all.

So, the answer is, because it's not a valid 7z archive, but rather 7z is used to explore other ~archive files that it supports, like NSIS installer packages in this case. But okay, fair enough.

I don't see a problem with replacing \s with just ' '. On the contrary. If the input has whitespace zoo, our whole sed recipe would fail: we're very precise in the rest of the pattern(s).

Alright, I was just saying that it's not an equivalent replacement, but you have a valid point.

Don't worry about tricky bugs: the tests at #3744 were updated.

Oh joys of test driven development ;-) I've just pushed out the new patches and Travis build was successful. Unless Andrew has anything to say against it, I'll try to merge it tomorrow.

Thank you mooffie for your help!

@mc-butler
Copy link
Author

Changed by andrew_b (@aborodin) on Dec 24, 2016 at 4:56 UTC (comment 20.21)

Replying to zaytsev:

Unless Andrew has anything to say against it

No, I haven't. Everything is fine.

@mc-butler
Copy link
Author

Changed by zaytsev (@zyv) on Dec 24, 2016 at 5:51 UTC (comment 22)

  • Branch state changed from on review to approved
  • Votes set to zaytsev andrew_b

@mc-butler
Copy link
Author

Changed by zaytsev (@zyv) on Dec 24, 2016 at 5:54 UTC (comment 23)

  • Votes changed from zaytsev andrew_b to committed-master
  • Status changed from accepted to testing
  • Resolution set to fixed
  • Branch state changed from approved to merged

Merged to master: [12dbd95].

@mc-butler
Copy link
Author

Changed by zaytsev (@zyv) on Dec 24, 2016 at 5:55 UTC (comment 24)

  • Status changed from testing to closed

@mc-butler
Copy link
Author

Changed by zaytsev (@zyv) on Feb 22, 2025 at 11:50 UTC

@mc-butler
Copy link
Author

Changed by zaytsev (@zyv) on Feb 22, 2025 at 11:50 UTC

  • Blocked by set to #3730

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: vfs Virtual File System support prio: medium Has the potential to affect progress
Development

No branches or pull requests

2 participants