Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Samplebuilder is broken in v3.0.29 #1839

Closed
jech opened this issue Jun 20, 2021 · 11 comments
Closed

Samplebuilder is broken in v3.0.29 #1839

jech opened this issue Jun 20, 2021 · 11 comments

Comments

@jech
Copy link
Member

jech commented Jun 20, 2021

I've just tried upgrading Galene from v3.0.27 to v3.0.29. After fixing the issue in #1838, I'm getting corrupted video when attempting to save to disk, at least for VP8.
CC @robin-raymond

@robin-raymond
Copy link
Contributor

are you seeing PrevDroppedPackets > 0 ever?

@jech
Copy link
Member Author

jech commented Jun 20, 2021

I've only seen it once.

@robin-raymond
Copy link
Contributor

is it always corrupted or sometimes?

@jech
Copy link
Member Author

jech commented Jun 20, 2021

It's systematically corrupted. It takes about 20 seconds before it finds a keyframe (we send regular PLIs until we get a keyframe, so that shouldn't happen), and then there are green blocky artifacts that cover between half and the whole frame. This is on a lossless connection (streaming from localhost).

@robin-raymond
Copy link
Contributor

I think we need to capture RTP packet data for replay as a unit test. I'm unsure why it's breaking down but we need something to demonstrate it failing because it does work in other scenarios. You should see every single time stamp that enters gets popped out. Also the total bytes going to the payloads input should equal the total bytes coming out (up to the same timestamp point).

@jech
Copy link
Member Author

jech commented Jun 20, 2021

I've put a log at https://galene.org/samplebuilder-log.text . It's generated by the following:

diff --git a/diskwriter/diskwriter.go b/diskwriter/diskwriter.go
index 9068a20..97aca0c 100644
--- a/diskwriter/diskwriter.go
+++ b/diskwriter/diskwriter.go
@@ -453,6 +453,10 @@ func (t *diskTrack) Write(buf []byte) (int, error) {
 		return 0, nil
 	}
 
+	if strings.EqualFold(codec.MimeType, "video/vp8") {
+		log.Printf("Packet: %v %v (%v)", p.Timestamp, p.SequenceNumber, len(p.Payload))
+	}
+
 	if strings.EqualFold(codec.MimeType, "video/vp9") {
 		var vp9 codecs.VP9Packet
 		_, err := vp9.Unmarshal(p.Payload)
@@ -478,6 +482,10 @@ func (t *diskTrack) Write(buf []byte) (int, error) {
 			return len(buf), nil
 		}
 
+		if strings.EqualFold(codec.MimeType, "video/vp8") {
+			log.Printf("Sample: %v (%v)", ts, len(sample.Data))
+		}
+
 		keyframe := true
 
 		if strings.EqualFold(codec.MimeType, "video/vp8") ||

@robin-raymond
Copy link
Contributor

sorted.xlsx

I sorted and did some comparisons. Please double check but it looks like every packet timestamp in is getting combined into a single sample payload out at that timestamp. The total size of the sample is the total of the packets minus 4 bytes per packet (which your depacketizer must be stripping out of the packet payload). Without doing some kind of hash on the input/output data going in/out, the values in/out look correct. So unless the depacketizer is corrupting the packets before becoming samples, I don't see anything in/out of this routine that's incorrect. Do you see something that I missed?

@robin-raymond
Copy link
Contributor

The other thing to check is that in your text file sequence numbers coming out are always increasing and never jumping around. Obviously if the samples coming out are jumbled in the wrong order that would cause issues.

@jech
Copy link
Member Author

jech commented Jul 7, 2021

@robin-raymond it looks like the code is broken, please see #1873.

@Sean-Der would you consider reverting the samplebuilder to v3.0.27 until the new code is fixed? v3 is supposed to be stable.

@jech
Copy link
Member Author

jech commented Jul 11, 2021

Since the samplebuilder has been broken for over a month now, Galene is switching to its own samplebuilder. Please see https://github.org/jech/samplebuilder.

@Sean-Der, please let me know whether you want me to submit it for inclusion, or whether I should keep working outside of Pion.

@Sean-Der
Copy link
Member

Sean-Der commented Aug 1, 2021

Fixed with 36cf395

@Sean-Der Sean-Der closed this as completed Aug 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

3 participants