ValueError: invalid literal for int() with base 10: in clean_osm_data #153
Comments
Hi @oayana, At some point we split a line that had voltage information in the format (110;220) into two lines, 110 and 220. So we probably missed splitting the cables as well. I think we need an alternative to the workaround. Do you want to create a PR and think about a solution?
First of all, you're welcome :) While I was performing the analysis for Turkey, I got this error on line 368 of the osm_data_cleaning script. I tried a workaround as a solution, but if you want to find a better one, I would of course like to help.
Hi @oayana, The number of cables can only be an integer (1, 2, 3, ...). The problematic line represents two lines running in parallel, one with 2 cables and one with 3 cables. This indeed led to the error that (2;3) cannot be converted to an integer. As you can see below, we split the voltages because they also contained a semicolon. As the docstring explains, this function separates the voltages and creates an identical line with all the previous data. So while we now have two lines with separated voltages and no semicolon, the cable information is still not fixed. Each line now has something like Line1['cable'] -> (2;3), Line2['cable'] -> (2;3) Solution:
Goal (simplified):
Do you want to be on the contributor list with such a fix? :)
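For illustration, the goal described in this comment (each split line presumably keeping only its own voltage and cable count) could be sketched as follows. This is a minimal sketch under stated assumptions, not the project's code: the helper name split_paired_cells, the column names and the sample data are hypothetical, and values are assumed to be paired by position.

```python
import pandas as pd


def split_paired_cells(df, cols=("voltage", "cables"), sep=";"):
    """Hypothetical helper: one output row per separated entry, pairing `cols` by position."""
    rows = []
    for _, row in df.iterrows():
        parts = {c: str(row[c]).split(sep) for c in cols}
        n = max(len(p) for p in parts.values())
        # Each column must hold either one shared value or exactly n values.
        if any(len(p) not in (1, n) for p in parts.values()):
            raise ValueError(f"cannot pair values by position: {parts}")
        for i in range(n):
            new_row = row.copy()
            for c in cols:
                p = parts[c]
                new_row[c] = p[i] if len(p) > 1 else p[0]
            rows.append(new_row)
    out = pd.DataFrame(rows).reset_index(drop=True)
    for c in cols:
        out[c] = out[c].astype(int)  # safe now: no separator is left
    return out


# Made-up sample: line "a" is two parallel circuits, line "b" is a single one.
lines = pd.DataFrame(
    {"id": ["a", "b"], "voltage": ["110;220", "380"], "cables": ["2;3", "3"]}
)
print(split_paired_cells(lines))
```

With this pairing, line "a" becomes two rows (110 with 2 cables, 220 with 3 cables), and rows whose value counts do not match raise an error instead of being guessed at.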
I understand the problem and would like to offer you a solution. I will handle it as soon as possible and get back to you :)
Hi Max, You have already produced the solution for this. It can be solved with the split_cells function, the same way you applied it to the voltage values.
Then,
Hi @oayana, I have only one doubt that we need to check before the PR can be accepted. Let's assume one line has semicolon-separated values for voltage and cables at the same time:
Then, I believe applying
instead of the desired solution which I mentioned above (#153 (comment)). I would suggest writing a small Jupyter notebook to test whether a line with a semicolon in both columns is handled correctly by the split_cells function. Almost there :)
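A small test along these lines might look like the sketch below. It does not call the project's split_cells; split_one_column is a generic stand-in that splits one column at a time, and the sample line is made up. It only illustrates why splitting the two columns independently may not give the desired pairing.

```python
import pandas as pd

# One made-up line with a semicolon in both columns.
line = pd.DataFrame({"id": ["L1"], "voltage": ["110;220"], "cables": ["2;3"]})


def split_one_column(df, col, sep=";"):
    """Generic stand-in (not the project's split_cells): one row per value in `col`."""
    out = df.copy()
    out[col] = out[col].str.split(sep)
    return out.explode(col, ignore_index=True)


# Splitting voltage first and cables second yields every combination.
sequential = split_one_column(split_one_column(line, "voltage"), "cables")
pairs = list(zip(sequential["voltage"], sequential["cables"]))
print(len(sequential))  # 4
print(pairs)  # [('110', '2'), ('110', '3'), ('220', '2'), ('220', '3')]
```

Under this assumption the single line turns into four lines rather than two, so the check is worth running against the real function before relying on it.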
Hi Max,
Some cables values (in df_all_lines["cables"]) cannot be converted to int (e.g. 3;6).
As a workaround, you can use:
df_all_lines["cables"] = df_all_lines["cables"].astype(str).str.replace(";", ".").astype(float).astype(int)
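To see what this workaround does to the data, here is a minimal check on a made-up Series (the sample values are assumptions, not taken from the OSM data). Replacing ";" with "." turns "3;6" into 3.6, and the final cast truncates it to 3, so only the first cable count survives.

```python
import pandas as pd

# Made-up cable values, including two semicolon-separated entries.
cables = pd.Series(["3;6", "2", "2;3"])

# Same chain of calls as the workaround above.
converted = cables.astype(str).str.replace(";", ".").astype(float).astype(int)
print(converted.tolist())  # [3, 2, 2]
```

The conversion no longer fails, but the second value of each pair is silently dropped, which is why it only works as a stopgap.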