Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dimension must also be inferred from headers #1509

Merged
merged 6 commits into from
Feb 23, 2022

Conversation

abdelr
Copy link
Member

@abdelr abdelr commented Feb 22, 2022

Fixes #1427 (Dimension should also be inferred from headers)

@abdelr abdelr added the Importer-Redesign Redesign of observed data import label Feb 22, 2022
@abdelr abdelr self-assigned this Feb 22, 2022
@@ -515,6 +515,18 @@ public ToolTipDescription ToolTipDescriptionFor(int index)
};
}

private void setUnitAndDimension(ColumnMappingDTO model)
{
var supportedDimensions = _columnInfos.First(x => x.DisplayName == model.MappingName).SupportedDimensions;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that I have seens this code
_columnInfos.First(x => x.DisplayName == model.MappingName e.g. finding a column by display name so many times, that I am believe it needs to be refactored.
What about you create cache with DisplayName keys instead of IREadOnlyList. That way you can simply do columnInfos[model.MappingName].

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are indeed several places on tests. On code is only here and three on the DataImporterTask. Refactoring to a cache for this might be unnecessary since the _columnInfos is used in several places in the code and test code should not drive our development. Besides, it is also used as _columnInfos.First(ci => !(ci.IsAuxiliary() || ci.IsBase())) which will not benefit from a cache. I would think on creating maybe some extension methods if it keeps popping up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cache is a ienumerable by default. It will be the same code everywhere except where you can simplify it drastically

{
var supportedDimensions = _columnInfos.First(x => x.DisplayName == model.MappingName).SupportedDimensions;
var unit = _format.ExtractUnitDescriptions(model.ExcelColumn, supportedDimensions);
if (unit.SelectedUnit != UnitDescription.InvalidUnit)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if equal, return. Exit early

var unit = _format.ExtractUnitDescriptions(model.ExcelColumn, supportedDimensions);
if (unit.SelectedUnit != UnitDescription.InvalidUnit)
{
var mappingDataFormatParameter = (model.Source as MappingDataFormatParameter);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we sure this is a MappingDataFormatParameter. If yes, explain why as comment
I am seeing this type of casting with as also a lot in the code. This is very unusual to always know what to cast to and I believe there is a need for refactoring here as well

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(but not for this PR...it's a general observation)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right. We definitely need a refactoring. The thing is DataFormatParameters have a lot of common use but some special features are concrete for mappings (time, measurement and error). I would say there could be a refactoring missing but I would like to postpone it for now.

(_model.Source as MappingDataFormatParameter).MappedColumn.Unit.SelectedUnit.ShouldBeEqualTo(shouldUpdate ? newUnitDescription : oldUnitDescription);
var mappingDataFormat = (_model.Source as MappingDataFormatParameter);
mappingDataFormat.MappedColumn.Unit.SelectedUnit.ShouldBeEqualTo(shouldUpdate ? newUnitDescription : oldUnitDescription);
if (shouldUpdate && newUnitDescription != "?")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's with the ? Don't we have a constant for it

var mappingDataFormat = (_model.Source as MappingDataFormatParameter);
mappingDataFormat.MappedColumn.Unit.SelectedUnit.ShouldBeEqualTo(shouldUpdate ? newUnitDescription : oldUnitDescription);
if (shouldUpdate && newUnitDescription != "?")
mappingDataFormat.MappedColumn.Dimension.HasUnit(mappingDataFormat.MappedColumn.Unit.SelectedUnit).ShouldBeTrue();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mappingDataFormat.MappedColumn : create a variable and use it. It's referenced 3 times. This is going to make the test easier to understand

@abdelr
Copy link
Member Author

abdelr commented Feb 23, 2022

@msevestre, @georgeDaskalakis ColumnInfos is now a Cache<string, ColumnInfo> and it is always created as new Cache<string, ColumnInfo>(getKey: : x => x.DisplayName). This way we do not write anymore _columnInfos.First(x => x.DisplayName == "Time") but _columnInfos["Time"]. Notice there are several files touched so please, take a look and merge as soon as possible if the change is approved so we avoid collisions later.

@msevestre
Copy link
Member

I meant in the presenter. The columnInfo otherwise could have stayed as is.

@@ -186,7 +186,7 @@ protected override void Context()
Name = "Time",
Unit = new UnitDescription("s")
},
ColumnInfo = _columnInfos[0]
ColumnInfo = _columnInfos.ElementAt(0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think those indexes here should use keys instead. Also what is the displaName of the column info defined on line 175. It it name ? You are using it as key and it's not defined

@@ -169,7 +169,7 @@ protected static MetaDataCategory createMetaDataCategory<T>(string descriptiveNa

public class When_setting_settings : concern_for_ImporterPresenter
{
protected IReadOnlyList<ColumnInfo> _columnInfos = A.Fake<IReadOnlyList<ColumnInfo>>();
protected Cache<string, ColumnInfo> _columnInfos = A.Fake<Cache<string, ColumnInfo>>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why a fake here? This does not make sense.

{
var columns = new List<ColumnInfo>();
var columns = new Cache<string, ColumnInfo>(getKey: x => x.DisplayName);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have name and display name in columnInfo. I am getting a bit confused

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they are not both needed since we always have them assigned to the same value. So we could create an issue to remove one of them (DisplayName I would say)

@@ -709,7 +709,7 @@
// base.GlobalContext();
//
// var dataRepositoryMapper = A.Fake<IImportDataTableToDataRepositoryMapper>();
// var columnInfos = A.Fake<IReadOnlyList<ColumnInfo>>();
// var columnInfos = A.Fake<Cache<string, ColumnInfo>>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are those tests commented out?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test file was left, removing it.

@msevestre msevestre merged commit 1469a20 into develop Feb 23, 2022
@msevestre msevestre deleted the 1427_dimension_must_also_be_inferred_from_headers branch February 23, 2022 10:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Importer-Redesign Redesign of observed data import
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Importer should try to parse dimension/unit from header after selecting a column for "Measurement"
2 participants