Add Excel Import column mapping & import field type handling

Estrusco edited this page Nov 24, 2016 · 4 revisions
Clone this wiki locally

I've created a remake of the current Excel Import from the Basic Samples (here: https://github.com/volkanceylan/Serene/tree/master/Serene/Serene.Web/Modules/BasicSamples/Grids/ProductExcelImport)

I've essentially taken the existing setup and re-wrote it to achieve the following:

  • One single xyzEndpoint.cs file to manage all importable columns for the table
  • Re-usable entry manager to ensure imported data is managed appropriately
  • Automatically add non-existent joint fields
  • Auto-map fields by ID and Header to match against the system table, ensuring proper update of proper table field regardless of extra columns, missing columns or columns being out of their regular order.
  • Make as much of the code reusable in a ExcelImportHelper.cs file, so as to make it easier to create other import files.
  • refresh grid and cache after import to display the newly entered data with any newly created fields

NOTE: Currently, if you use an exported file, the date field may be in the wrong format as it exports as DD/MM/YY but expects the import to be in MM/DD/YY. You can simply set the format on the excel spreadsheet as short date before import or add a handler.


First we have a myExcelImportHelper.cs file I created that manages most of the repetitive tasks we will be using. See the in code comments for details on what each section does. I've also added in the commentary how to handle custom entry types such as enum's or other custom entries that cant be explicitly casted.

namespace myProject.myImportHelper
{
    using OfficeOpenXml;    
    using Serenity.Data;    
    using System;
    using System.Collections.Generic;    
    using System.Linq;

    public class ExcelImportHelper
    {
        public static int IndexOfHeader(string[] arr, string value)
        {
            //IndexofHeader gets index location of the field within the expected header
            //array. This way the order of the elements in the expected headers 
            //array is irrelevant. 

            for (int i = 0; i < arr.Length; i++)
            {
                if (arr[i] == value)
                {
                    return i;
                }
            }
            return -1;
        }
        public static int ColumnIndex(string reference)
        {
            //takes a Alphabetic column name and translates it to a column index
            int ci = 0;
            reference = reference.ToUpper();
            for (int ix = 0; ix < reference.Length && reference[ix] >= 'A'; ix++)
                ci = (ci * 26) + ((int)reference[ix] - 64);
            return ci;
        }

        public static int[] myExcelHeaderMap(ExcelRange headers, string[] expectedHeaders, List<string> myErrors, int lastCol)
        {
            //Takes the existing list of headers from the system (expectedHeaders) and 
            //compares it against the headers list from the data imported. When it finds
            //a match it takes the Letter Address from the import file and makes it into 
            //a digit value specific to the column index. It then saves the new index and 
            //returns a array with -1 for not-necessary columns. 
            string[] headerArray = new string[lastCol]; //Array index for imported headers. 
            var i = 0;
            string eMessage = null;
            int[] myCols = new int[expectedHeaders.Length];

            foreach (var firstRowCell in headers) //Reads the import header range and places them into an array
            {
                headerArray[i] = firstRowCell.Text; 
                i++;
            }

            for (int counter = 0; counter < expectedHeaders.Length; counter++) 
            {
                for (var x = 0; x < headerArray.Length; x++) //comapres each headerArray value to the existing system headers
                {
                    var wsHeader = headerArray[x];
                    var currEHeader = expectedHeaders[counter]; 
                    //if the header exists it reads the address of the column. 
                    if (wsHeader == currEHeader) 
                    {
                        var g = headers.First(h => h.Value.Equals(currEHeader)); 
                        var alphaAddress = g.Address[0].ToString(); 
                        if (x >= 26) //if larger than 'Z'
                        {
                            alphaAddress = g.Address[0].ToString() + g.Address[1].ToString();
                            if (x >= 702) //if larger than 'ZZ'
                            {
                                alphaAddress = g.Address[0].ToString() + g.Address[1].ToString() + g.Address[2].ToString();
                            }
                        }

                        //the ColumnIndexer changes the Alpha address to a digit. 
                        myCols[counter] = ColumnIndex(alphaAddress); 
                        eMessage = null;
                        break;
                    }
                    else 
                    {
                        //If no imported columns matched, it lets the user know that the column was missing. 
                        myCols[counter] = -1;
                        eMessage = "Column " + expectedHeaders[counter] + " appears to be missing from the file"; 
                    }
                }
                if (eMessage != null)
                    myErrors.Add(eMessage);
            }
            return myCols; 
        }

        public enum entryType
        {
            //Enum managing the myImportEntry field type. You can add to this enum to trigger 
            //the handling of any specific or custom field type you wil be importing

            String = 1,
            Decimal = 2,
            Int16 = 3,
            Bool = 4
           //custom Entry for custom sample: 
           /*            
            hertz = 5             

            */
        }

        public static object myImportEntry(object myVal, object myField, List<string> myErrors, string expectedHeader, int row,  entryType myType)
        {
            //We take the entry values and we identify what the type should be using the entryType Enum. 
            //Depending on the entryType, we then use explicit casting to handle the value or we define our own handling. 
            /*

             Example of custom handling: 
             //converting entry into the proper enum entry type for a Hertz Enum. My hertz enum is defined as : 

            [EnumKey("myProject.Hertz")]
               public enum Hertz
               {
                [Description("60")]
                Sixty = 1,
                [Description("50")]
                Fifty = 2,
               }

            I handle it as such: 

            //....//

              case entryType.hertz:
                            {
                                var val1 = Convert.ToString(val);
                                if (val1 == "50")
                                    myField = Hertz.Fifty;
                                else if (val1 == "60")
                                    myField = Hertz.Sixty;
                                else
                                {
                                    myField = Hertz.Sixty;
                                    myErrors.Add(" Added Hertz field as default value");
                                }
                                break;
                            };
            //...// 



             */
            if (myVal != null)
            {
                try
                {
                    dynamic val = myVal;
                    switch (myType)
                    {
                        case entryType.String:
                            myField = Convert.ToString(val);
                            break;
                        case entryType.Decimal:
                            myField = (decimal)val;
                            break;
                        case entryType.Int16:
                            myField = (Int16)val;
                            break;
                        case entryType.Bool:
                            myField = (bool)val;
                            break;
                        default:
                            myField = null;
                            break;
                    }
                    return myField; //returns handled value
                }
                catch (Exception e)
                {
                    myErrors.Add(expectedHeader + " on row " + row + "had a Exception : " + e.Message);
                    return myField = null; //gives a null value and passes error message
                }
            }
            return myField;
        }

        public static object[] myExcelRowValues(int eLength, int row, int[] myCols, ExcelWorksheet worksheet)
        {
            //here we read the value of the given row and create a object array to 
            //store the values. This array is then passed back to be handled by the 
            //import xyzEndpont.cs file. 
            object[] myVal = new object[eLength];
            for (var f = 0; f < eLength; f++)
            {
                try
                {
                    if (myCols[f] > -1)
                        myVal[f] = worksheet.Cells[row, myCols[f]].Value;
                }
                catch
                {
                    myVal[f] = null;
                }
            }
            return myVal;
        }
    }
}

Next, lets create the xyzDialog.ts. Here you have the same file from the basic sample but we add "Q.reloadLookup("xyzLookupscript")" for each lookup script that will be needing to be refreshed on the cache during import. You'll want to create one of these for each table you will be importing, updating the reloadLookup command according to each table.

namespace myProject.BasicSamples {

    @Serenity.Decorators.registerClass()
    export class ProductExcelImportDialog extends Serenity.PropertyDialog<any, any> {

        private form: ProductExcelImportForm;

        constructor() {
            super();

            this.form = new ProductExcelImportForm(this.idPrefix);
        }

        protected getDialogTitle(): string {
            return "Excel Import";
        }

        protected getDialogButtons(): Serenity.DialogButton[] {
            return [
                {
                    text: 'Import',
                    click: () => {
                        if (!this.validateBeforeSave())
                            return;

                        if (this.form.FileName.value == null ||
                            Q.isEmptyOrNull(this.form.FileName.value.Filename)) {
                            Q.notifyError("Please select a file!");
                            return;
                        }

                        ProductExcelImportService.ExcelImport({
                            FileName: this.form.FileName.value.Filename
                        }, response => {
                            Q.reloadLookup("Northwind.Supplier"); //<---add the lookup being used that need refresh from cache. 
                            Q.reloadLookup("Northwind.Category");
                            Q.notifyInfo(
                                'Inserted: ' + (response.Inserted || 0) +
                                ', Updated: ' + (response.Updated || 0));

                            if (response.ErrorList != null && response.ErrorList.length > 0) {
                                Q.notifyError(response.ErrorList.join(',\r\n '));
                            }



                            this.dialogClose();
                        });
                    },
                },
                {
                    text: 'Cancel',
                    click: () => this.dialogClose()
                }
            ];
        }
    }
}

Lastly the xyzEndpoint.cs file would be specific to the table you intend to import to. You'd generate one of these for each table and then you'd handle the importing fields accordingly. You'll need to handle only the fields that would ever be expected to be imported. The endpoint file will handle the fields as either a standard entry, a joint field, or ignore it if it doesn't exist in the import file.

namespace myProject.BasicSamples.Endpoints
{
    using Northwind.Entities;
    using Northwind.Repositories;
    using OfficeOpenXml;
    using Serenity;
    using Serenity.Data;
    using Serenity.Services;
    using Serenity.Web;
    using System;
    using System.Collections.Generic;
    using System.Data;
    using System.IO;
    using System.Web.Mvc;
    using MyRow = Northwind.Entities.ProductRow;
    using SupRow = Northwind.Entities.SupplierRow; //<---Note I added SupRow so I can more easily edit the joint fields
    using CatRow = Northwind.Entities.CategoryRow; //<--Same as above
    using System.Linq;
    using eImp = myImportHelper.ExcelImportHelper;

    [RoutePrefix("Services/BasicSamples/ProductExcelImport"), Route("{action}")]
    [ConnectionKey("Northwind"), ServiceAuthorize]
    public class ProductExcelImportController : ServiceEndpoint
    {    

        [HttpPost]
        public ExcelImportResponse ExcelImport(IUnitOfWork uow, ExcelImportRequest request)
        {
            request.CheckNotNull();
            Check.NotNullOrWhiteSpace(request.FileName, "filename");
            UploadHelper.CheckFileNameSecurity(request.FileName);

            if (!request.FileName.StartsWith("temporary/"))
                throw new ArgumentOutOfRangeException("filename");

            ExcelPackage ep = new ExcelPackage();
            using (var fs = new FileStream(UploadHelper.DbFilePath(request.FileName), FileMode.Open, FileAccess.Read))
                ep.Load(fs);

            var response = new ExcelImportResponse();
            response.ErrorList = new List<string>();
            var myErrors = response.ErrorList;
            string errorMessage;

            //The following code reads the first sheet in the Excel worksheet and  
            //identifies the dimensions of the range. It them takes a manually generated 
            //list of the expected headers. The expected headrs list works as the Index for 
            //matching the identified columns being imported to the proper field in the system.
            //Columns are '0' indexed. 
            //Example: Since supplier is in the third array field of the expected headers, 
            //the product.SupplierId field will need to reference colIndex number '3' so 
            //that it fills into the correct field in the system. 

            var worksheet = ep.Workbook.Worksheets.First(); //grabs first sheet from worksheet file for import
            var lastCol = worksheet.Dimension.End.Column; //Last column with data in worksheet
            var lastRow = worksheet.Dimension.End.Row; //last row with data in worksheet
            var firstCol = worksheet.Dimension.Start.Column; //first column with data in worksheet
            var firstRow = worksheet.Dimension.Start.Row; //first row with data in workesheet. 
            var headers = worksheet.Cells[firstRow, firstCol, 1, lastCol]; //range containing headers

            //I get the displayed titles for each column currently existing in the system. 
            //The onnly exception is the Id Field, which if the title is taken from the field, 
            //it would give the wrong title since the one we actually would want would be from 
            //the IIdField which I don't know how to get. (When you try to pull the myFields.ProductId.Title 
            //you get "Product Id" which is the display name for the field but the field name is replaced for 
            //the ID field.) I pass these to an array holding my expected headers when a file is imported. 

            //Side Note: I instantiante the field titles as I do because I find it, personally, more readable. 
            //You can always pass them directly onto the array

            var myFields = MyRow.Fields;
            string Id = "ID";
            string ProdName = myFields.ProductName.Title; 
            string Disc = myFields.Discontinued.Title;
            string Supplier = myFields.SupplierCompanyName.Title;
            string Category = myFields.CategoryName.Title;
            string QPU = myFields.QuantityPerUnit.Title;
            string UP = myFields.UnitPrice.Title; 
            string UIS = myFields.UnitsInStock.Title; 
            string UOO = myFields.UnitsOnOrder.Title;
            string RL = myFields.ReorderLevel.Title;      

            string[] expectedHeaders = { Id, ProdName, Disc, Supplier, Category, QPU, UP, UIS, UOO, RL }; 

            //I use the expected headers to compare against the headers on the imported file getting the 
            //index location in the imported file for the matching headers only. 

            int[] myCols = eImp.myExcelHeaderMap(headers, expectedHeaders, myErrors, lastCol); //matches headers between worksheet and system. 

            //A few variables  to make my life easier//

            var myConnection = uow.Connection;
            dynamic obj; //for capturing value in proper type 
            int m; //System Column Index


            for (var row = 2; row <= lastRow; row++)
            {
                //The myExcelRowValues takes the values in each cell of the current import file row and puts it 
                //into a object array matching the order of the system column index. 
                object[] myVal = eImp.myExcelRowValues(expectedHeaders.Length, row, myCols, worksheet);

                try
                {

                    //This first instance with the ID field checks the key for the row if it exists or not. 
                    //if the row ID exists, it will use it to update the row with the imported fields, if
                    //it does not exist, it created a new instance. 

                    //eImp.IndexOfHeader gets index location of the field within the expected header
                    //array. This way the order of the elements in the expected headers 
                    //array is irrelevant. 
                    int keyCol = eImp.IndexOfHeader(expectedHeaders, Id); 

                    var sysKey = myFields.ProductID;
                    var wsKeyField = Convert.ToInt32(myVal[keyCol]);
                    var product = myConnection.TryFirst<MyRow>(q => q.Select(sysKey).Where(sysKey == wsKeyField));


                    if (product == null)
                        product = new MyRow(); //If the product doesn't exist already, it creates a new one. 
                    else
                        product.TrackWithChecks = false;     // avoid assignment errors


                    //Once past the row ID we identify the type of field to be updated and 
                    //manage the update accordingly. We start by checking if there was a column
                    //value present for that field. If not, it skips editing that field and 
                    //moves on. We then identify the field type(String, joint, int, enum, etc)
                    //and choose what to do accordingly. 

                    //-----------------Joint Field: Product.SupplierId <-> SupplierRow.CustomerName-----------------------------//
                    m = eImp.IndexOfHeader(expectedHeaders, Supplier); //See eImp.IndexOfHeader definition for more details
                    if (myVal[m] != null) 
                    {

                        //In the event of a joint field, you identify the ID field for the joint table and select the fields to be compared. 

                        var g = SupRow.Fields;                         
                        var sysID = g.SupplierID; //Id on joint Row Field
                        var sysField = g.CompanyName; //field being compared against
                        var wsField = Convert.ToString(myVal[m]); //field to compare from import file
                        var currRow = myConnection.TryFirst<SupRow>(q => q.Select(sysID).Where(sysField == wsField)); //checks if field exists

                        //If field does not exist, creates an entry within the joint table and 
                        //assigns the new ID to the system field. It also lets the user know. 

                        if (currRow == null) 
                        {
                            try
                            {
                                var newRow = myConnection.InsertAndGetID(new SupRow { CompanyName = wsField }); //create new entry
                                //var newRow = myConnection.TryFirst<SupRow>(q => q.Select(sysID).Where(sysField == wsField)); //get the new ID
                                //var newRow = new SupRow { CompanyName = wsField };
                                product.SupplierID = Convert.ToInt32(newRow);
                                //product.SupplierID = newRow.SupplierID; //assign ID of new joint row to main table field
                                errorMessage = "Warning On Row " + row + ": Entity with name '" + wsField + "' was not found so it was added!"; //let the user know
                                myErrors.Add(errorMessage);
                            }
                            catch (IOException e)
                            {
                                myErrors.Add("IOException for " + expectedHeaders[m] + " Field for row " + row + " : " + e.Message);

                            }
                        }
                        else
                        {
                            //as a backup, it may assign a default value to the field
                            product.SupplierID = currRow.SupplierID; 
                        }
                    }

                    //-----------------Joint Field: Product.CategoryId <-> CategoryRow.CategoryName-----------------------------//


                    m = eImp.IndexOfHeader(expectedHeaders, Category);
                    if (myVal[m] != null)//only edits if needed. When editing, it recognized and manages the field as a joint field
                    {
                        var g = CatRow.Fields;
                        int myCol = m; 
                        var sysID = g.CategoryID;
                        var sysField = g.CategoryName;
                        var wsField = Convert.ToString(myVal[myCol]);
                        var currRow = myConnection.TryFirst<CatRow>(q => q.Select(sysID).Where(sysField == wsField));

                        if (currRow == null)
                        {
                            try
                            {
                                var newRow = myConnection.InsertAndGetID(new CatRow { CategoryName = wsField });                                                           
                                //var newRow = myConnection.TryFirst<CatRow>(q => q.Select(sysID).Where(sysField == wsField));
                               // var newRow = new CatRow { CategoryName =  wsField };
                                product.CategoryID = Convert.ToInt32(newRow);
                                errorMessage = "Warning On Row " + row + ": Entity with name '" + wsField + "' was not found so it was added!"; //let the user know
                                myErrors.Add(errorMessage);
                            }
                            catch (IOException e)
                            {
                                myErrors.Add("IOException for " + expectedHeaders[m] + " Field for row " + row + " : " + e.Message);

                            }
                        }
                        else
                        {                            
                            product.CategoryID = currRow.CategoryID; //uses default category 
                        }
                    }

                    //--------------------------------------------------------------------------------------//

                    //Much like in the joint field, it first checks whether the field 
                    //needs to be edited or not. If not, it skips and moves on to the 
                    //next field. If it does, it runs the Import Entry to properly 
                    //cast the incoming value into the proper format type. 
                    //You simply copy this section, updating the IndexOfHeader with 
                    //the proper field title, the myImportEntry current field value 
                    //and item type, and the field reference to be updated. 
                    //
                    //See: myImportEntry definition for more detail on it. 
                    //
                    //Example:
                    //
                    //m = eImp.IndexOfHeader(expectedHeaders, <FieldTitle>);
                    //obj = eImp.myImportEntry(myVal[m], <currentFieldValue>, myErrors, expectedHeaders[m], row, <FieldTypeEnum>);
                    //if (obj != null)//only edits if needed
                    //{
                    //    <FieldToUpdate> = obj;//performs entry using proper-cast value
                    //    obj = null;//clears field for next entry
                    //}

                    m = eImp.IndexOfHeader(expectedHeaders, ProdName);//locates proper index location of system field
                    obj = eImp.myImportEntry(myVal[m], product.ProductName, myErrors, expectedHeaders[m], row, eImp.entryType.String);
                    if (obj != null)//only edits if needed
                    {
                        product.ProductName = obj;//performs entry using proper-cast value
                        obj = null;//clears field for next entry
                    }

                    m = eImp.IndexOfHeader(expectedHeaders, Disc);
                    obj = eImp.myImportEntry(myVal[m], product.Discontinued, myErrors, expectedHeaders[m], row, eImp.entryType.Bool);
                    if (obj != null)//only edits if needed
                    {
                        product.Discontinued = obj;//performs edit
                        obj = null;//clears field for next entry
                    }

                    m = eImp.IndexOfHeader(expectedHeaders, QPU);
                    obj = eImp.myImportEntry(myVal[m], product.QuantityPerUnit, myErrors, expectedHeaders[m], row, eImp.entryType.String);
                    if (obj != null)//only edits if needed
                    {
                        product.QuantityPerUnit = obj;//performs edit
                        obj = null;//clears field for next entry
                    }

                    m = eImp.IndexOfHeader(expectedHeaders, UP);
                    obj = eImp.myImportEntry(myVal[m], product.UnitPrice, myErrors, expectedHeaders[m], row, eImp.entryType.Decimal);
                    if (obj != null)//only edits if needed
                    {
                        product.UnitPrice = obj;//performs edit
                        obj = null;//clears field for next entry
                    }

                    m = eImp.IndexOfHeader(expectedHeaders, UIS);
                    obj = eImp.myImportEntry(myVal[m], product.UnitsInStock, myErrors, expectedHeaders[m], row, eImp.entryType.Int16);
                    if (obj != null)//only edits if needed
                    {
                        product.UnitsInStock = obj;//performs edit
                        obj = null;//clears field for next entry
                    }

                    m = eImp.IndexOfHeader(expectedHeaders, UOO);
                    obj = eImp.myImportEntry(myVal[m], product.UnitsOnOrder, myErrors, expectedHeaders[m], row, eImp.entryType.Int16);
                    if (obj != null)//only edits if needed
                    {
                        product.UnitsOnOrder = obj;//performs edit
                        obj = null;//clears field for next entry
                    }           

                    m = eImp.IndexOfHeader(expectedHeaders, RL);
                    obj = eImp.myImportEntry(myVal[m], product.ReorderLevel, myErrors, expectedHeaders[m], row, eImp.entryType.Int16);
                    if (obj != null)//only edits if needed
                    {
                        product.ReorderLevel = obj; //performs edit
                        obj = null; //clears field for next entry
                    }

                    //If the row was newly created, you count it and list it in this response
                    if (product.ProductID == null)
                    {
                        new ProductRepository().Create(uow, new SaveWithLocalizationRequest<MyRow>
                        {
                            Entity = product
                        });
                        response.Inserted = response.Inserted + 1;

                    }

                    //If the row existed and was updated, you count it and list it in this response
                    else
                    {
                        new ProductRepository().Update(uow, new SaveWithLocalizationRequest<MyRow>
                        {
                            Entity = product,
                            EntityId = product.ProductID.Value
                        });
                        response.Updated = response.Updated + 1;
                    }
                }
                catch (Exception ex)
                {
                    myErrors.Add("Exception on Row " + row + ": " + ex.Message);
                }
            }

            return response;
        }        
    }
}

With those three files in place, you can now create the excel import files by simply recreating for each needed table:

  • the dialog file ->updating the reloadlookups
  • the endpoint file -> updating which fields will be updated upon import.

Take note, this currently lets you have full control of which fields you want to import by matching the headers from the imported file to what the display name values for the fields are. As long as the imported file header matches the title of the field, it will assume it is the correct field. Be wary of duplicate headers. Additionally, I'm defining the 'Id' field as the key definer of whether an entry exists or not. You can always change that to whatever unique field you wish to use to define your entry.


I'm pretty certain that this can be simplified further so I welcome any commentary that may help best improve on this. I'd also like to note that I purposely set the code to read the data header and row in a separate instance so that it can be replaced with any other form of data table other than excel. I'll leave that part to those that wish to explore on it.

See also: Adding Excel Export