encodeByNameSansHeader: like encodeByName but don't write the header. Us... #41

Closed
wants to merge 1 commit into from

alang9
Contributor

@alang9 alang9 commented Aug 6, 2013

...e it only for ordering

It seems difficult to encode a vector of "named" values which is too big to fit into memory without this.

@tibbe
Collaborator

tibbe commented Aug 6, 2013

I don't understand what you mean by:

It seems difficult to encode a vector of "named" values which is too big to fit into memory without this.

Writing the header seems to take a negligible amount of memory compared to holding the actual data in memory. Could you please clarify?

I'm wary about adding any more top-level encode methods, as we will end up with a combinatorial explosion. If this makes sense, then I think we should add it as an EncodeOptions field.

@alang9
Contributor Author

alang9 commented Aug 6, 2013

Sorry, I wasn't very clear. Basically, I mean that I might want to append to a CSV file which already has its header written at the top.

I agree with making this an EncodeOptions field instead. I'll submit a patch if you think this makes sense.
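
To make the options-field idea concrete, here is a minimal sketch using only `base`. It does not use cassava's actual API: `Options`, `includeHeader`, `encodeNamed`, and `appendNamed` are hypothetical names, the `Header` and `NamedRecord` types are simplified stand-ins, and no quoting or escaping is done.

```haskell
import Data.List (intercalate)
import Data.Maybe (fromMaybe)

-- Simplified, hypothetical stand-ins for a header and a named record.
type Header = [String]
type NamedRecord = [(String, String)]

-- Hypothetical options record; 'includeHeader' plays the role of the
-- proposed EncodeOptions field.
newtype Options = Options { includeHeader :: Bool }

-- Render one row, using the header only to order the fields.
-- Fields missing from the record become empty cells. No quoting is done.
encodeRow :: Header -> NamedRecord -> String
encodeRow hdr r = intercalate "," [fromMaybe "" (lookup name r) | name <- hdr]

-- Encode all rows, emitting the header line only when asked to.
encodeNamed :: Options -> Header -> [NamedRecord] -> String
encodeNamed opts hdr rs = unlines (headerLines ++ map (encodeRow hdr) rs)
  where
    headerLines = [intercalate "," hdr | includeHeader opts]

-- Append rows to a file that already has its header at the top.
appendNamed :: FilePath -> Header -> [NamedRecord] -> IO ()
appendNamed path hdr rs = appendFile path (encodeNamed (Options False) hdr rs)
```

With this shape, the header still determines column order, but a single boolean decides whether it is written, so no extra top-level encode function is needed.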

@tibbe
Collaborator

tibbe commented Aug 6, 2013

If we made the encode family of functions use constant space (i.e. by having them take a [a] instead of a Vector a) would that solve your memory issue?
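
A rough sketch of what list-based encoding buys, using only `base` rather than cassava itself. `Row`, `renderRow`, `writeRows`, and `sampleRows` are hypothetical names, and quoting is omitted. The point is that a lazily produced list can be written row by row, so rows that have already been written can be garbage collected, as long as the caller does not hold on to the whole list elsewhere.

```haskell
import Data.List (intercalate)
import System.IO (IOMode (WriteMode), hPutStrLn, withFile)

-- Hypothetical row type: one string per cell, no quoting or escaping.
type Row = [String]

-- Render a single row as one comma-separated line.
renderRow :: Row -> String
renderRow = intercalate ","

-- Write rows one at a time. Each row is rendered and written before the
-- next one is demanded from the list.
writeRows :: FilePath -> [Row] -> IO ()
writeRows path rows =
  withFile path WriteMode $ \h ->
    mapM_ (hPutStrLn h . renderRow) rows

-- A lazily generated table of the given size, for experimentation.
sampleRows :: Int -> Int -> [Row]
sampleRows nRows nCols =
  [ [show (r * nCols + c) | c <- [0 .. nCols - 1]] | r <- [0 .. nRows - 1] ]
```

For example, `writeRows "out.csv" (sampleRows 5000000 100)` never needs the full table as one in-memory structure, which a strict `Vector` argument would require.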

@alang9
Contributor Author

alang9 commented Aug 6, 2013

Yes, that would probably work.

@tibbe
Collaborator

tibbe commented Aug 6, 2013

Just out of curiosity, how much data are you trying to write?

@alang9
Contributor Author

alang9 commented Aug 6, 2013

About 5 million rows by 100 columns. I guess I would be able to fit all of it in memory, but it still seems like something you should be able to do without using that much memory.

@tibbe
Collaborator

tibbe commented Aug 6, 2013

I went with the constant space encoding route instead. Fixed in #42.

@tibbe tibbe closed this Aug 6, 2013