Feature Request: Add OutputType parameter to Import-Csv #8862
Comments
This really got a lot of traction :) |
Only just saw this now - seems well worth doing. |
Just came across https://stackoverflow.com/q/58660818/45375, where an out-of-memory exception occurred even in a streaming scenario; that is, the objects weren't even collected in full in memory and instead just piped back to Is the problem in this case one of mounting memory pressure due to lack of garbage collections? Would it make sense to build periodic garbage collection into the command?
|
https://stackoverflow.com/a/60356120/45375 may give this issue a bit more exposure.
|
Shouldn't this be done (or also possible) via a calculated property, like:
$e = Import-Csv -Path .\data.csv -Property
@{Name = 'Name'; Type = [string]},
@{Name = 'Number'; Type = [long]},
@{Name = 'When'; Type = [DateTime]}
}
Where the default type is a I guess that a calculated At second thought, I think this isn't possible as the property types of a |
Yet another thought to consider: |
Interesting ideas, @iRon7, but they are complementary to what is being proposed here, so I encourage you to create new issues:
Also, something that probably fits better into the context of this issue and the associated PR (#8860) in terms of implementation, is what @bergmeister has suggested before (emphasis added):
|
Despite my own Instead of:
$e = Import-Csv -Path .\data.csv
$e | ForEach-Object { <process your item> } | <output your results>
Keep your CSV data as it is:
$e = Get-Content -Path .\data.csv
$e | ConvertFrom-Csv | ForEach-Object { <process your item> } | <output your results> |
This issue has not had any activity in 6 months, if this is a bug please try to reproduce on the latest version of PowerShell and reopen a new issue and reference this issue if this is still a blocker for you. |
This issue has been marked as "No Activity" as there has been no activity for 6 months. It has been closed for housekeeping purposes. |
Summary of the new feature/enhancement
By using a concrete type instead of PSObject, both import speed and memory usage can be improved.
As it is today, Import-Csv is almost useless for larger datasets, since the overhead of our NoteProperties is 48 bytes, not counting the name and the value. When the imported values are integers, that is a blow-up factor of ~20.
These numbers are from my prototype:
By keeping it as strings, the import speed is vastly improved. By converting to integers, the speed is still improved, and the memory requirements are vastly improved.
Proposed technical implementation details (optional)
See #8860.
The gist is to generate expression trees that either set the properties on an instance of the provided type or call its constructor.
The use of the constructor allows for custom type conversion, where there are no language conversions from string to the property type.
The type needs to have members that match the names of the columns in the CSV.
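For readers who want to see the intended end result, here is a minimal sketch of how typed rows can be approximated in today's PowerShell with a cast. It is not the implementation from #8860, which would generate expression trees inside the cmdlet. The class name CsvRow and the sample data are made up for illustration.

```powershell
# Hypothetical row type; its property names must match the CSV column names.
class CsvRow {
    [string]   $Name
    [long]     $Number
    [datetime] $When
}

# Inline sample data, invented for this example.
$csv = @'
Name,Number,When
alpha,1,2019-01-01
beta,2,2019-02-01
'@

# ConvertFrom-Csv emits PSCustomObjects whose values are all strings.
# The cast creates a CsvRow per row and converts each string to the
# declared property type.
$rows = $csv | ConvertFrom-Csv | ForEach-Object { [CsvRow]$_ }

# Inspect the resulting property type.
$rows[0].Number.GetType().FullName
```

Note that this workaround still creates the intermediate PSObject for every row, so it shows the shape of the output rather than the speed and memory gains the proposal is after.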
Maybe we should provide a way of providing alternate headers to map to existing objects?
Data.csv
I also implemented constructor calls, which take precedence.
I would like to see a discussion about the feature set, error handling, names for parameters etc.