Improve the metadata collection function #13

Closed
gutzbenj opened this issue Sep 11, 2019 · 0 comments
Labels
enhancement New feature or request

Comments


gutzbenj commented Sep 11, 2019

The German Weather Service stores its metadata in a poor format that has no separators (neither commas nor tabs). Instead of reading the data the way we currently do (reading row by row, checking the number of chunks in each row, and then processing the row according to that chunk count),
```python
import pandas as pd

# Excerpt from fix_metaindex (metaindex is a DataFrame defined earlier)

# Split off the columns that need fixing (seventh column onwards)
metaindex_to_fix = metaindex.iloc[:, 6:]

# Reduce the original dataframe by those columns
metaindex = metaindex.iloc[:, :6]

# Index is fixed by string operations (put together all except the last
# string which refers to state)
metaindex_to_fix = (
    metaindex_to_fix
    .agg(lambda data: [string for string in data if string is not None], 1)
    .to_frame()
    .iloc[:, 0]
    .agg(lambda data: [' '.join(data[:-1]), data[-1]])
    .apply(pd.Series)
)

# Finally put together again the original frame and the fixed data
metaindex = pd.concat([metaindex, metaindex_to_fix], axis=1)
```

we should find another way to read it directly with pandas using predefined settings, for example read_fwf (fixed-width formatted). This should give us a performance boost and also fix some other issues with string handling.
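A rough sketch of what reading with read_fwf could look like. The records, column names and column widths below are invented for illustration and are not the actual layout of the German Weather Service files; the real widths would have to be taken from the metadata files themselves.

```python
import io

import pandas as pd

# Hypothetical records; values, column names and widths are assumptions
# made up for this sketch, not the real metadata layout.
records = [
    ("00001", "19370101", "19860630", "Aach", "Baden-Wuerttemberg"),
    ("00002", "19510101", "20061231", "Bad Aibling", "Bayern"),
    ("00003", "18910101", "20110331", "Aachen", "Nordrhein-Westfalen"),
]
names = ["station_id", "from_date", "to_date", "station_name", "state"]
widths = [6, 9, 9, 21, 20]

# Build a fixed-width text block: every value is left-aligned and padded
# with spaces to the width of its column, so there are no separators.
sample = "\n".join(
    "".join(f"{value:<{width}}" for value, width in zip(record, widths))
    for record in records
)

# Read all columns in one call. dtype=str keeps the leading zeros of the
# station id, and a station name containing a space stays in one column.
metaindex = pd.read_fwf(
    io.StringIO(sample),
    widths=widths,
    header=None,
    names=names,
    dtype=str,
)

print(metaindex)
```

Because the column boundaries are given up front, a station name such as "Bad Aibling" does not need to be glued back together from several chunks afterwards.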

gutzbenj added the enhancement (New feature or request) label Sep 11, 2019
gutzbenj changed the title from "Improving the metadata collection function" to "Improve the metadata collection function" Sep 11, 2019