1. Fix partial last lines at end of file; 2. Fix incorrect origin #1181
Conversation
Force-pushed from 1dec318 to eebb621
Force-pushed from 1474111 to 038aa43
```go
// Read new data: try a limited number of times.
for i := maxConsecutiveEmptyReads; i > 0; i-- {
	n, err := b.rd.Read(b.buf[b.w:])
	if n < 0 {
```
Even when err is not nil, b.w += n is still needed.
> Even when err is not nil, b.w += n is still needed.

done
```go
		}
	}
	b.err = io.ErrNoProgress
```
If ErrNoProgress is removed here, readSlice above will never get an err back and will keep looping forever.
b.err = err is already set above, so this one can be ignored.
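For reference, the comments above converge on the same shape as the standard library's bufio.Reader.fill retry loop: b.w is advanced by n before err is inspected, the error is recorded in b.err, and io.ErrNoProgress is only set once the retry budget is exhausted. A sketch of that loop (modeled on the stdlib; logkit's bufreader adds its own source-index bookkeeping on top):

```go
// Read new data: try a limited number of times.
for i := maxConsecutiveEmptyReads; i > 0; i-- {
	n, err := b.rd.Read(b.buf[b.w:])
	if n < 0 {
		panic(errNegativeRead)
	}
	b.w += n // advance the write offset even if err != nil
	if err != nil {
		b.err = err // readSlice sees this and stops looping
		return
	}
	if n > 0 {
		return
	}
}
// Only reached after maxConsecutiveEmptyReads reads that returned no data.
b.err = io.ErrNoProgress
```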
reader/bufreader/bufreader.go (outdated diff)
```go
SIdx := rc.NewSourceIndex()
for _, v := range SIdx {
	// The index returned by NewSourceIndex is the amount of data belonging to the
	// previous DataSource within this read batch; adding b.w gives the total data
	// of the previous DataSource.
	if len(SIdx) > 0 {
```
This check is unnecessary; SIdx is always > 0 here, otherwise something is already buggy.
I think the root of this problem is how "end of file" is defined. seqfile treats hitting EOF as the end of the file, but for real-time reading that is not rigorous: if, at the moment EOF is reached, part of the data has not yet been written to the file, a \n gets appended anyway, and the changes above are all fixes for that scenario. In my view the file only really ends when it expires, but we cannot know in advance whether more data will be written after EOF and before expiry, so there is no way to decide whether to append \n. I would therefore suggest splitting this into two cases:

For now we will not do the big refactor along the lines described above.
Force-pushed from d0588fc to 2db45ad
lgtm
```go
if len(dr.readcache) == 0 && dr.halfLineCache[source] == "" {
	if key, exist := utils.GetKeyOfNotEmptyValueInMap(dr.halfLineCache); exist {
		source = key
		// Roughly one hour without reading any content: mark it as inactive.
```
Why is this one hour?
> Why is this one hour?

One hour is the pre-existing logic; this change only moves the code.
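For readers unfamiliar with the helper in the snippet above: utils.GetKeyOfNotEmptyValueInMap presumably returns some source that still has half-line data cached. A minimal sketch of such a helper (illustrative only; logkit's actual implementation may differ):

```go
// getKeyOfNotEmptyValueInMap returns a key whose value is a non-empty string
// and reports whether one was found. Map iteration order in Go is unspecified,
// so which key is returned is arbitrary when several qualify.
func getKeyOfNotEmptyValueInMap(m map[string]string) (string, bool) {
	for k, v := range m {
		if v != "" {
			return k, true
		}
	}
	return "", false
}
```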
```go
	dr.readLock.Unlock()
}

if cache, ok := dr.halfLineCache[source]; ok {
```
https://github.com/qiniu/logkit/pull/1181/files#diff-d48b01cd98a25cbcc145cb6eda975cc2d56e6ef77d5c08fe4776e966bde20accL116 I see the original logic checked the size; does the original behavior need to be kept here?
A size limit was added to keep data from accumulating indefinitely when no newline ever arrives.
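To illustrate the size-capped accumulation being discussed, here is a hypothetical helper (not the PR's actual code; the 20 MB figure comes from the diff below):

```go
// appendHalfLine caches a trailing partial line for source, but refuses to grow
// the cache past maxBytes, so a line that never receives its newline cannot
// consume memory without bound. It reports whether the data was cached.
func appendHalfLine(cache map[string]string, source, data string, maxBytes int) bool {
	if len(cache[source])+len(data) > maxBytes {
		return false // caller should flush or log instead of caching more
	}
	cache[source] += data
	return true
}
```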
```go
		dr.readcache = ""
		continue
	} else {
		delete(dr.halfLineCache, source)
```
Doesn't this need to take readLock?
> Doesn't this need to take readLock?

It does; added.
reader/seqfile/seqfile.go (outdated diff)
```go
n += n1
if n1 > 0 {
	eofTimes = 0
	if len(sf.newLineBytesSourceIndex) != 0 && sf.newLineBytesSourceIndex[len(sf.newLineBytesSourceIndex)-1].Source == sf.currFile {
```
Put len(sf.newLineBytesSourceIndex) into a variable for readability.
> Put len(sf.newLineBytesSourceIndex) into a variable for readability.

done
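The suggested readability tweak would look roughly like this (the variable name is illustrative):

```go
// Compute the length once instead of calling len() twice in the condition.
idxLen := len(sf.newLineBytesSourceIndex)
if idxLen != 0 && sf.newLineBytesSourceIndex[idxLen-1].Source == sf.currFile {
	// ...
}
```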
```go
return
if err == io.EOF {
	time.Sleep(time.Millisecond * 10)
	eofTimes++
```
This logic needs to be verified with a real test.
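For context, the pattern under review retries on EOF a bounded number of times before treating EOF as the real end of data, so a writer that is momentarily behind does not cause a half-written last line to be cut. A self-contained sketch of that idea (names, package, and the retry budget are illustrative; seqfile's actual logic differs in detail):

```go
package reader // illustrative package name

import (
	"io"
	"time"
)

// readWithEOFRetry keeps reading when EOF is returned with no data, sleeping
// briefly between attempts, and only reports EOF after maxEOFTimes consecutive
// empty reads.
func readWithEOFRetry(r io.Reader, buf []byte, maxEOFTimes int) (int, error) {
	eofTimes := 0
	for {
		n, err := r.Read(buf)
		if n > 0 {
			return n, nil // got data; the caller restarts the EOF count next call
		}
		if err == io.EOF {
			eofTimes++
			if eofTimes >= maxEOFTimes {
				return 0, io.EOF
			}
			time.Sleep(10 * time.Millisecond)
			continue
		}
		return n, err
	}
}
```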
There are a lot of changes here. You need to test normal reading and the end-of-file partial-line case yourself for single/tailx/dir/dirx, and also benchmark performance against the previous version.
```go
	dr.readcache = cache + dr.readcache
	dr.readLock.Unlock()
}
if !strings.HasSuffix(dr.readcache, string(dr.br.GetDelimiter())) && dr.numEmptyLines < 3 && len(dr.readcache) < 20*MB {
```
Should we log something when len(dr.readcache) >= 20*MB, to make troubleshooting easier?
> Should we log something when len(dr.readcache) >= 20*MB, to make troubleshooting easier?

done
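The requested log might look something like the following; the message text and the log.Warnf call are illustrative (logkit generally logs through github.com/qiniu/log), and the exact line added in the PR is not shown here:

```go
// Illustrative only: warn when the cached data hits the 20 MB cap so operators
// can tell why a very long line was emitted without its delimiter.
if len(dr.readcache) >= 20*MB {
	log.Warnf("readcache for source %q reached %d bytes without a line delimiter; emitting as-is to avoid unbounded growth",
		source, len(dr.readcache))
}
```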
Force-pushed from 82da3dc to efc7caa
Usage has been verified in a customer environment; performance testing needs assistance from QA.
Have all of single/tailx/dir/dirx been verified in the customer environment? After the review comments are addressed, each scenario needs to be re-tested, so that fixing one problem does not introduce more.
I am mainly worried about whether the test cases are comprehensive. Otherwise lgtm.
Fixes [issue number]
Changes
Reviewers
Wiki Changes
Checklist