Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DRAFT] for testing : Fix 4Gb limit for large files on Git for Windows #2179

Open
wants to merge 14 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
78 changes: 39 additions & 39 deletions apply.c
Original file line number Diff line number Diff line change
Expand Up @@ -177,9 +177,9 @@ static void set_default_whitespace_mode(struct apply_state *state)
* of context lines.
*/
struct fragment {
unsigned long leading, trailing;
unsigned long oldpos, oldlines;
unsigned long newpos, newlines;
size_t leading, trailing;
size_t oldpos, oldlines;
size_t newpos, newlines;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think that we can, and should, break apart this huge patch. In other words, I don't believe the statement that this is the minimal patch ;-)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minimal patch

Can I just say that the Github review interface is really poor here. I have no idea (from the web page) which patch is which, which belongs to which commit etc.

Assuming this was Torsten's original patch it did fit on the mailing list as a 'less than sufficient' patch https://public-inbox.org/git/20190413151850.29037-1-tboegi@web.de/ (it's how I applied it, from an email on-list;-).

If the 'minimal' comment is about something else, excuse my mistake.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea (from the web page) which patch is which, which belongs to which commit etc.

That's an optimization in the web UI, with a very easy way out: click on the file name above the diff, i.e. https://github.com/git-for-windows/git/pull/2179/files/5b60f72b34f213309d5870b7df4b1038914e4fc0..9c2f8de134713f7e04fc2402567c5b0c29605737#diff-3edc96d0eefd86960d342f661171a62c

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming this was Torsten's original patch it did fit on the mailing list as a 'less than sufficient' patch

It changes 85 files. Eighty-five! Nobody can review such a patch properly, not @tboegi, not you, not me.

And when I look at the diffstat, I immediately see e.g. that there are plenty of changes in apply.c and plenty in builtin/pack-objects.c. Those two files share preciously little concern. So I am rather certain that it would be easy even to split this at the file boundary.

Which still does not make sense, as it is still not a logically-minimal patch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to split this at the file boundary.

Sorry, I meant at that file boundary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And indeed, if I apply only the builtin/pack-objects.c part, for example, it compiles Just Fine. So the commit message is maybe not quite truthful when it says

The smallest step seems to be much bigger than expected.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or when it says this:

However, when the Git code base is compiled with a compiler that complains that "unsigned long" is different from size_t, we end up in this huge patch, before the code base cleanly compiles.

As I just proved, there is a quite a bit smaller patch that already compiles cleanly: just the 70 changed lines of builtin/pack-objects.c. And I bet that we can do more of the same to cut this into nice mouthfuls, without breaking the build.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PhilipOakley I'd like to focus your attention on the two comments above this one.

/*
* 'patch' is usually borrowed from buf in apply_patch(),
* but some codepaths store an allocated buffer.
Expand Down Expand Up @@ -423,9 +423,9 @@ static int read_patch_file(struct strbuf *sb, int fd)
return 0;
}

static unsigned long linelen(const char *buffer, unsigned long size)
static size_t linelen(const char *buffer, size_t size)
{
unsigned long len = 0;
size_t len = 0;
while (size--) {
len++;
if (*buffer++ == '\n')
Expand Down Expand Up @@ -1321,7 +1321,7 @@ static int parse_git_header(struct apply_state *state,
unsigned int size,
struct patch *patch)
{
unsigned long offset;
size_t offset;

/* A git diff has explicit new/delete information, so we don't guess */
patch->is_new = 0;
Expand Down Expand Up @@ -1391,7 +1391,7 @@ static int parse_git_header(struct apply_state *state,
return offset;
}

static int parse_num(const char *line, unsigned long *p)
static int parse_num(const char *line, size_t *p)
{
char *ptr;

Expand All @@ -1402,7 +1402,7 @@ static int parse_num(const char *line, unsigned long *p)
}

static int parse_range(const char *line, int len, int offset, const char *expect,
unsigned long *p1, unsigned long *p2)
size_t *p1, size_t *p2)
{
int digits, ex;

Expand Down Expand Up @@ -1517,19 +1517,19 @@ static int parse_fragment_header(const char *line, int len, struct fragment *fra
*/
static int find_header(struct apply_state *state,
const char *line,
unsigned long size,
size_t size,
int *hdrsize,
struct patch *patch)
{
unsigned long offset, len;
size_t offset, len;

patch->is_toplevel_relative = 0;
patch->is_rename = patch->is_copy = 0;
patch->is_new = patch->is_delete = -1;
patch->old_mode = patch->new_mode = 0;
patch->old_name = patch->new_name = NULL;
for (offset = 0; size > 0; offset += len, size -= len, line += len, state->linenr++) {
unsigned long nextlen;
size_t nextlen;

len = linelen(line, size);
if (!len)
Expand Down Expand Up @@ -1667,14 +1667,14 @@ static void check_old_for_crlf(struct patch *patch, const char *line, int len)
*/
static int parse_fragment(struct apply_state *state,
const char *line,
unsigned long size,
size_t size,
struct patch *patch,
struct fragment *fragment)
{
int added, deleted;
int len = linelen(line, size), offset;
unsigned long oldlines, newlines;
unsigned long leading, trailing;
size_t oldlines, newlines;
size_t leading, trailing;

offset = parse_fragment_header(line, len, fragment);
if (offset < 0)
Expand Down Expand Up @@ -1789,11 +1789,11 @@ static int parse_fragment(struct apply_state *state,
*/
static int parse_single_patch(struct apply_state *state,
const char *line,
unsigned long size,
size_t size,
struct patch *patch)
{
unsigned long offset = 0;
unsigned long oldlines = 0, newlines = 0, context = 0;
size_t offset = 0;
size_t oldlines = 0, newlines = 0, context = 0;
struct fragment **fragp = &patch->fragments;

while (size > 4 && !memcmp(line, "@@ -", 4)) {
Expand Down Expand Up @@ -1864,8 +1864,8 @@ static inline int metadata_changes(struct patch *patch)
patch->old_mode != patch->new_mode);
}

static char *inflate_it(const void *data, unsigned long size,
unsigned long inflated_size)
static char *inflate_it(const void *data, size_t size,
size_t inflated_size)
{
git_zstream stream;
void *out;
Expand Down Expand Up @@ -1894,7 +1894,7 @@ static char *inflate_it(const void *data, unsigned long size,
*/
static struct fragment *parse_binary_hunk(struct apply_state *state,
char **buf_p,
unsigned long *sz_p,
size_t *sz_p,
int *status_p,
int *used_p)
{
Expand All @@ -1911,10 +1911,10 @@ static struct fragment *parse_binary_hunk(struct apply_state *state,
* to 1-26 bytes, and 'a'-'z' corresponds to 27-52 bytes.
*/
int llen, used;
unsigned long size = *sz_p;
size_t size = *sz_p;
char *buffer = *buf_p;
int patch_method;
unsigned long origlen;
size_t origlen;
char *data = NULL;
int hunk_size = 0;
struct fragment *frag;
Expand Down Expand Up @@ -2006,7 +2006,7 @@ static struct fragment *parse_binary_hunk(struct apply_state *state,
*/
static int parse_binary(struct apply_state *state,
char *buffer,
unsigned long size,
size_t size,
struct patch *patch)
{
/*
Expand Down Expand Up @@ -2123,7 +2123,7 @@ static int use_patch(struct apply_state *state, struct patch *p)
* the number of bytes consumed otherwise,
* so that the caller can call us again for the next patch.
*/
static int parse_chunk(struct apply_state *state, char *buffer, unsigned long size, struct patch *patch)
static int parse_chunk(struct apply_state *state, char *buffer, size_t size, struct patch *patch)
{
int hdrsize, patchsize;
int offset = find_header(state, buffer, size, &hdrsize, patch);
Expand Down Expand Up @@ -2153,7 +2153,7 @@ static int parse_chunk(struct apply_state *state, char *buffer, unsigned long si
if (!patchsize) {
static const char git_binary[] = "GIT binary patch\n";
int hd = hdrsize + offset;
unsigned long llen = linelen(buffer + hd, size - hd);
size_t llen = linelen(buffer + hd, size - hd);

if (llen == sizeof(git_binary) - 1 &&
!memcmp(git_binary, buffer + hd, llen)) {
Expand Down Expand Up @@ -2397,7 +2397,7 @@ static void update_pre_post_images(struct image *preimage,
static int line_by_line_fuzzy_match(struct image *img,
struct image *preimage,
struct image *postimage,
unsigned long current,
size_t current,
int current_lno,
int preimage_limit)
{
Expand Down Expand Up @@ -2466,7 +2466,7 @@ static int match_fragment(struct apply_state *state,
struct image *img,
struct image *preimage,
struct image *postimage,
unsigned long current,
size_t current,
int current_lno,
unsigned ws_rule,
int match_beginning, int match_end)
Expand Down Expand Up @@ -2677,7 +2677,7 @@ static int find_pos(struct apply_state *state,
int match_beginning, int match_end)
{
int i;
unsigned long backwards, forwards, current;
size_t backwards, forwards, current;
int backwards_lno, forwards_lno, current_lno;

/*
Expand Down Expand Up @@ -2861,7 +2861,7 @@ static int apply_one_fragment(struct apply_state *state,
int new_blank_lines_at_end = 0;
int found_new_blank_lines_at_end = 0;
int hunk_linenr = frag->linenr;
unsigned long leading, trailing;
size_t leading, trailing;
int pos, applied_pos;
struct image preimage;
struct image postimage;
Expand Down Expand Up @@ -3085,9 +3085,9 @@ static int apply_one_fragment(struct apply_state *state,
*/
if ((leading != frag->leading ||
trailing != frag->trailing) && state->apply_verbosity > verbosity_silent)
fprintf_ln(stderr, _("Context reduced to (%ld/%ld)"
fprintf_ln(stderr, _("Context reduced to (%"PRIuMAX"/%"PRIuMAX")"
" to apply fragment at %d"),
leading, trailing, applied_pos+1);
(uintmax_t)leading, (uintmax_t)trailing, applied_pos+1);
update_image(state, img, applied_pos, &preimage, &postimage);
} else {
if (state->apply_verbosity > verbosity_normal)
Expand All @@ -3109,7 +3109,7 @@ static int apply_binary_fragment(struct apply_state *state,
struct patch *patch)
{
struct fragment *fragment = patch->fragments;
unsigned long len;
size_t len;
void *dst;

if (!fragment)
Expand Down Expand Up @@ -3199,7 +3199,7 @@ static int apply_binary(struct apply_state *state,
if (has_object_file(&oid)) {
/* We already have the postimage */
enum object_type type;
unsigned long size;
size_t size;
char *result;

result = read_object_file(&oid, &type, &size);
Expand Down Expand Up @@ -3244,7 +3244,7 @@ static int apply_fragments(struct apply_state *state, struct image *img, struct
while (frag) {
nth++;
if (apply_one_fragment(state, img, frag, inaccurate_eof, ws_rule, nth)) {
error(_("patch failed: %s:%ld"), name, frag->oldpos);
error(_("patch failed: %s:%"PRIuMAX), name, (uintmax_t)frag->oldpos);
if (!state->apply_with_reject)
return -1;
frag->rejected = 1;
Expand All @@ -3261,7 +3261,7 @@ static int read_blob_object(struct strbuf *buf, const struct object_id *oid, uns
strbuf_addf(buf, "Subproject commit %s\n", oid_to_hex(oid));
} else {
enum object_type type;
unsigned long sz;
size_t sz;
char *result;

result = read_object_file(oid, &type, &sz);
Expand Down Expand Up @@ -4290,7 +4290,7 @@ static int add_index_file(struct apply_state *state,
const char *path,
unsigned mode,
void *buf,
unsigned long size)
size_t size)
{
struct stat st;
struct cache_entry *ce;
Expand Down Expand Up @@ -4344,7 +4344,7 @@ static int add_index_file(struct apply_state *state,
*/
static int try_create_file(struct apply_state *state, const char *path,
unsigned int mode, const char *buf,
unsigned long size)
size_t size)
{
int fd, res;
struct strbuf nbuf = STRBUF_INIT;
Expand Down Expand Up @@ -4395,7 +4395,7 @@ static int create_one_file(struct apply_state *state,
char *path,
unsigned mode,
const char *buf,
unsigned long size)
size_t size)
{
int res;

Expand Down Expand Up @@ -4487,7 +4487,7 @@ static int create_file(struct apply_state *state, struct patch *patch)
{
char *path = patch->new_name;
unsigned mode = patch->new_mode;
unsigned long size = patch->resultsize;
size_t size = patch->resultsize;
char *buf = patch->result;

if (!mode)
Expand Down
18 changes: 9 additions & 9 deletions archive-tar.c
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@
#define BLOCKSIZE (RECORDSIZE * 20)

static char block[BLOCKSIZE];
static unsigned long offset;
static size_t offset;

static int tar_umask = 002;

Expand Down Expand Up @@ -63,12 +63,12 @@ static void write_if_needed(void)
* queues up writes, so that all our write(2) calls write exactly one
* full block; pads writes to RECORDSIZE
*/
static void do_write_blocked(const void *data, unsigned long size)
static void do_write_blocked(const void *data, size_t size)
{
const char *buf = data;

if (offset) {
unsigned long chunk = BLOCKSIZE - offset;
size_t chunk = BLOCKSIZE - offset;
if (size < chunk)
chunk = size;
memcpy(block + offset, buf, chunk);
Expand All @@ -90,7 +90,7 @@ static void do_write_blocked(const void *data, unsigned long size)

static void finish_record(void)
{
unsigned long tail;
size_t tail;
tail = offset % RECORDSIZE;
if (tail) {
memset(block + offset, 0, RECORDSIZE - tail);
Expand All @@ -99,7 +99,7 @@ static void finish_record(void)
write_if_needed();
}

static void write_blocked(const void *data, unsigned long size)
static void write_blocked(const void *data, size_t size)
{
do_write_blocked(data, size);
finish_record();
Expand Down Expand Up @@ -128,7 +128,7 @@ static int stream_blocked(const struct object_id *oid)
{
struct git_istream *st;
enum object_type type;
unsigned long sz;
size_t sz;
char buf[BLOCKSIZE];
ssize_t readlen;

Expand Down Expand Up @@ -211,7 +211,7 @@ static size_t get_path_prefix(const char *path, size_t pathlen, size_t maxlen)

static void prepare_header(struct archiver_args *args,
struct ustar_header *header,
unsigned int mode, unsigned long size)
unsigned int mode, size_t size)
{
xsnprintf(header->mode, sizeof(header->mode), "%07o", mode & 07777);
xsnprintf(header->size, sizeof(header->size), "%011"PRIoMAX , S_ISREG(mode) ? (uintmax_t)size : (uintmax_t)0);
Expand All @@ -232,7 +232,7 @@ static void prepare_header(struct archiver_args *args,

static void write_extended_header(struct archiver_args *args,
const struct object_id *oid,
const void *buffer, unsigned long size)
const void *buffer, size_t size)
{
struct ustar_header header;
unsigned int mode;
Expand All @@ -253,7 +253,7 @@ static int write_tar_entry(struct archiver_args *args,
struct ustar_header header;
struct strbuf ext_header = STRBUF_INIT;
unsigned int old_mode = mode;
unsigned long size, size_in_header;
size_t size, size_in_header;
void *buffer;
int err = 0;

Expand Down
2 changes: 1 addition & 1 deletion archive-zip.c
Original file line number Diff line number Diff line change
Expand Up @@ -297,7 +297,7 @@ static int write_zip_entry(struct archiver_args *args,
void *buffer;
struct git_istream *stream = NULL;
unsigned long flags = 0;
unsigned long size;
size_t size;
int is_binary = -1;
const char *path_without_prefix = path + args->baselen;
unsigned int creator_version = 0;
Expand Down
2 changes: 1 addition & 1 deletion archive.c
Original file line number Diff line number Diff line change
Expand Up @@ -73,7 +73,7 @@ static void format_subst(const struct commit *commit,
void *object_file_to_archive(const struct archiver_args *args,
const char *path, const struct object_id *oid,
unsigned int mode, enum object_type *type,
unsigned long *sizep)
size_t *sizep)
{
void *buffer;
const struct commit *commit = args->convert ? args->commit : NULL;
Expand Down
6 changes: 3 additions & 3 deletions archive.h
Original file line number Diff line number Diff line change
Expand Up @@ -52,8 +52,8 @@ typedef int (*write_archive_entry_fn_t)(struct archiver_args *args,

int write_archive_entries(struct archiver_args *args, write_archive_entry_fn_t write_entry);
void *object_file_to_archive(const struct archiver_args *args,
const char *path, const struct object_id *oid,
unsigned int mode, enum object_type *type,
unsigned long *sizep);
const char *path, const struct object_id *oid,
unsigned int mode, enum object_type *type,
size_t *sizep);

#endif /* ARCHIVE_H */
Loading