Tracking progress via Progress Bar #37

Closed
ambujpunn opened this issue Aug 1, 2018 · 6 comments

Comments

@ambujpunn

Is there a good way to track the progress of an import while it is happening? Right now, it is only possible to see the number of lines imported so far, but a UIProgressView needs an end value, i.e. the total number of lines. With that, we could simply divide the current number of lines by the total number of lines. However, I understand that CSVImporter imports the file one line at a time, which makes it hard to get the total until the end of the import. Is there any workaround for this?

@Jeehut
Member

Jeehut commented Aug 2, 2018

When working with (potentially) large text files it's a good idea to read the file line by line to get all the benefits of that approach (mentioned in the README). But if you know your file isn't too big to handle, or if you don't care, you can always use this initializer to create a CSVImporter object from a String. To do this you would need to read the contents of the CSV file yourself. This way you have a String object which you can use to get the total number of lines. The code could look something like this:

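import Foundation

// Read the whole file into memory (only sensible for files that comfortably fit in RAM).
let contentString = try! String(contentsOfFile: "path/to/your/file.csv")
// Rough count: "\r\n" line endings and a trailing newline produce extra empty components.
let totalLinesCount = contentString.components(separatedBy: CharacterSet.newlines).count
let importer = CSVImporter<[String: String]>(contentString: contentString)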

You can also see this example in the tests here.

The above code is a workaround though and might not perfectly work depending on the line ending of your file. As you can see here we already have the lines somewhere within CSVImporter, but it's not public, so you can't read it.
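
As a hedged sketch of how the line-ending caveat above could be sidestepped, the count can be taken with Swift's `Character.isNewline` instead of `CharacterSet.newlines`. The helper names `countNonEmptyLines` and `progressFraction` are hypothetical and not part of CSVImporter. The count includes a header line if the file has one, and it does not account for line breaks inside quoted fields.

```swift
import Foundation

/// Hypothetical helper: counts non-empty lines, treating "\n", "\r\n" and "\r" alike.
func countNonEmptyLines(in text: String) -> Int {
    // "\r\n" is a single Character in Swift, so it is not counted twice.
    // `split` drops empty subsequences by default, so blank lines and a
    // trailing newline do not inflate the count.
    return text.split(whereSeparator: { $0.isNewline }).count
}

/// Hypothetical helper: a value in 0...1 that could be assigned to UIProgressView.progress.
func progressFraction(imported: Int, total: Int) -> Float {
    guard total > 0 else { return 0 }
    return min(1, max(0, Float(imported) / Float(total)))
}
```

Assuming the importer reports the running number of imported lines while it works, that number and the total computed here can be passed to `progressFraction` on each update.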

I think to add official support for the total number of lines we could add a public computed property which returns an Optional to CSVImporter which could look like this:

public var totalDataLinesCount: Int? {
    // Only a string-based source has all of its lines available up front.
    guard let stringSource = source as? StringSource else { return nil }
    return stringSource.lines.count
}

It would only work if you initialize CSVImporter with a String, but it would make sure you don't get into trouble with line endings.

@ambujpunn Would you like to add this feature with test and send a PR? 😃

@ambujpunn
Author

@Dschee Wouldn't this only work when loading an entire CSV file into a huge string? Ideally, we'd like to keep and extend the awesome behavior of CSVImporter, which is to read line by line rather than store everything somewhere first.

@Jeehut
Member

Jeehut commented Aug 6, 2018

Well, there's a logical problem there though, isn't there? I mean, if you wanna read a file "line by line" then you can't know how many lines the file has since you haven't read the entire file yet, no? What you could do is guess the total number of lines based on the file size. But as this is not accurate by any means, I tend not to include such a feature in CSVImporter. It's gonna result in this.

If you have any other idea of how we could do this, then please, explain and I'll consider adding it.
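
For illustration only, the file-size guess mentioned above could be sketched as follows. The function `estimatedTotalLines` and its parameters are hypothetical and not part of CSVImporter. It assumes the first few lines are representative of the whole file, which is exactly why the result is only an estimate.

```swift
import Foundation

/// Hypothetical helper: extrapolates a total line count from the file size
/// and the average byte length of a few sampled lines.
func estimatedTotalLines(fileSizeInBytes: Int,
                         sampleLines: [String],
                         lineEndingBytes: Int = 1) -> Int? {
    guard fileSizeInBytes > 0, !sampleLines.isEmpty else { return nil }
    // Bytes taken up by the sample, including one line ending per line.
    let sampledBytes = sampleLines.reduce(0) { $0 + $1.utf8.count + lineEndingBytes }
    guard sampledBytes > 0 else { return nil }
    let averageBytesPerLine = Double(sampledBytes) / Double(sampleLines.count)
    return Int((Double(fileSizeInBytes) / averageBytesPerLine).rounded())
}
```

The file size itself can be read with `FileManager.default.attributesOfItem(atPath:)` under the `.size` key. A progress bar driven by such an estimate may overshoot or stall near the end when line lengths vary.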

@loukrieg

loukrieg commented Aug 6, 2018

Just a suggestion, but perhaps a separate API could be added that would iterate through the file in chunks, so everything wouldn't need to be in memory at once, just counting the line endings (not within quoted strings).
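
A minimal sketch of such a chunked counting pass, under stated assumptions: the names `QuotedLineCounter` and `countLines(atPath:chunkSize:)` are hypothetical and not part of CSVImporter, fields are quoted with double quotes, and lines end in "\n" or "\r\n" (a lone "\r" is not treated as a line ending). Blank lines are counted too.

```swift
import Foundation

/// Hypothetical incremental counter that ignores line breaks inside quoted fields.
struct QuotedLineCounter {
    private(set) var lineCount = 0
    private var insideQuotes = false
    private var hasPendingBytes = false

    init() {}

    /// Feeds one chunk of raw bytes; state is kept across chunk boundaries.
    mutating func consume(_ chunk: Data) {
        for byte in chunk {
            switch byte {
            case 0x22: // double quote; an escaped "" toggles twice and cancels out
                insideQuotes.toggle()
                hasPendingBytes = true
            case 0x0A where !insideQuotes: // "\n" outside of a quoted field
                lineCount += 1
                hasPendingBytes = false
            default:
                hasPendingBytes = true
            }
        }
    }

    /// Counts a final line that has no trailing newline and returns the total.
    mutating func finish() -> Int {
        if hasPendingBytes {
            lineCount += 1
            hasPendingBytes = false
        }
        return lineCount
    }
}

/// Reads the file in fixed-size chunks so it never has to fit in memory at once.
func countLines(atPath path: String, chunkSize: Int = 64 * 1024) -> Int? {
    guard let handle = FileHandle(forReadingAtPath: path) else { return nil }
    defer { handle.closeFile() }
    var counter = QuotedLineCounter()
    while true {
        let chunk = handle.readData(ofLength: chunkSize)
        if chunk.isEmpty { break }
        counter.consume(chunk)
    }
    return counter.finish()
}
```

This pass reads every byte of the file once before the actual import starts, so the file is traversed twice overall.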

@Jeehut
Member

Jeehut commented Aug 7, 2018

Yeah, that could be possible. But it would still mean that the file is traversed twice, once for checking the total number of lines and once for actually processing the data. Of course, in some cases this might not be a problem, so as long as the documentation is very clear on the performance drawback, I'd be happy to merge this feature into CSVImporter. Any volunteers? Cause I won't have much time in the coming months, maybe sometime in December ...

@Jeehut
Member

Jeehut commented Sep 28, 2018

I'm closing this feature request as not many people seemed to be interested in it and there's a workaround available by checking the file manually. Feel free to post a PR if you want this feature and are ready to implement it yourself.

Jeehut closed this as completed Sep 28, 2018