Speed up execution of mat2py #5

Closed
hcommin opened this issue Sep 14, 2023 · 3 comments

hcommin commented Sep 14, 2023

As in #4, the execution time of mat2py is dominated by Python library imports. For example, I ran this simple test script:

x_mat = randn(32768, 1);

tic;
for i = 1:1024
    x_py = mat2py(x_mat);
end
toc

And execution takes roughly 8.5 seconds:

Elapsed time is 8.511265 seconds.

Similar to #4, it seems like we can delete these lines:

    Im = @py.importlib.import_module;    
    np = Im('numpy');
    sp = Im('scipy.sparse');
    dt = Im('datetime');
    tz = Im('dateutil.tz');

Then we need to:

  • Replace all instances of np with py.numpy.
  • Replace sp with py.scipy.sparse.
  • Replace dt with py.datetime.
  • Replace tz with py.dateutil.tz.

After that, my simple (NumPy-only) example executes roughly 210x faster (about 8.5 s down to about 0.04 s):

Elapsed time is 0.040555 seconds.

I quickly tested the sp, dt and tz changes like this:

% Test sp
mat2py(sparse(magic(3)))

% Test dt
date_time = datetime('now');
mat2py(date_time)

% Test tz
date_time.TimeZone = 'Europe/Zurich';
mat2py(date_time)

And they appear to be working correctly.

@AlDanial
Collaborator

@hcommin : wow! Thanks for #3, #4, and this enhancement. I will have time to update the repo this weekend. Alternatively, if you'd like commit credit, I'll gladly take your pull requests.

hcommin commented Sep 15, 2023

@AlDanial Thank you for this highly useful code. Coming from a MATLAB background (with only ~3 years of intermittent exposure to Python), I have been wrestling with the MATLAB-Python interface for a long time. Your functions just behave exactly how I want them to.

I don't need commit credit. But it may be helpful if I share my code so you can diff and modify. I will paste py2mat.m in #4, and here is mat2py.m:

% Convert a MATLAB variable to an equivalent Python-native variable.
% py_var = mat2py(mat_var);
% py_var = mat2py(mat_var, 'bytes');  % char mapped to Python bytes
% py_var = mat2py(mat_var, 'string'); % char mapped to Python string
function [x_py] = mat2py(x_mat, char_to)
    arguments
        x_mat
        char_to = 'string';
    end

% {{{ code/matlab_py/mat2py.m
% This code accompanies the book _Python for MATLAB Development:
% Extend MATLAB with 300,000+ Modules from the Python Package Index_ 
% ISBN 978-1-4842-7222-0 | ISBN 978-1-4842-7223-7 (eBook)
% DOI 10.1007/978-1-4842-7223-7
% https://github.com/Apress/python-for-matlab-development
% 
% Copyright © 2022 Albert Danial
% 
% MIT License:
% Permission is hereby granted, free of charge, to any person obtaining a copy
% of this software and associated documentation files (the "Software"), to deal
% in the Software without restriction, including without limitation the rights
% to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
% copies of the Software, and to permit persons to whom the Software is
% furnished to do so, subject to the following conditions:
% 
% The above copyright notice and this permission notice shall be included in
% all copies or substantial portions of the Software.
% 
% THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
% IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
% FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
% THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
% LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
% FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
% DEALINGS IN THE SOFTWARE.
% }}}
    
    x_py = py.numpy.array({});
    switch class(x_mat)
        case 'char'
            if strcmp(char_to,'bytes')
                x_py = py.bytes(x_mat,'ASCII');
            else
                x_py = py.str(x_mat);
            end
        case 'string'
            x_py = py.str(x_mat);
        case 'datetime'
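            % Split seconds into whole and fractional parts. floor() is
            % needed because int64() alone rounds to the nearest integer.
            int_sec = int64(floor(x_mat.Second));
            frac_sec = x_mat.Second - double(int_sec);
            % Clamp so rounding can never yield 1,000,000 microseconds.
            micro_sec = min(int64(round(1e6 * frac_sec)), int64(999999));
            if ~isempty(x_mat.TimeZone)
                tzinfo = py.dateutil.tz.gettz(x_mat.TimeZone);
            else
                tzinfo = py.None;
            end
            % Pass int_sec, not int64(x_mat.Second), which could round up to 60.
            x_py = py.datetime.datetime(int64(x_mat.Year), int64(x_mat.Month), ...
                               int64(x_mat.Day) , int64(x_mat.Hour) , ...
                               int64(x_mat.Minute), int_sec, ...
                               micro_sec, tzinfo);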
        case {'double', 'single', ...
              'uint8', 'uint16', 'uint32', 'uint64', ...
              'int8',  'int16',  'int32',  'int64'}
            if issparse(x_mat)
                if ndims(x_mat) ~= 2
                    fprintf('mat2py:  can only convert 2D sparse matrices\n')
                    return
                end
                [nR,nC] = size(x_mat);
                [i,j,vals] = find(x_mat);
                % subtract 1 to go from 1-based to 0-based indices
                py_I    = py.numpy.array(int64(i)-1);
                py_J    = py.numpy.array(int64(j)-1);
                py_vals = mat2py(vals);
                py_dims = py.tuple({int64(nR), int64(nC)});
                py_IJ   = py.tuple({py_I, py_J});
                V_IJ    = py.tuple({py_vals, py_IJ});
                x_py = py.scipy.sparse.coo_matrix(V_IJ,py_dims);
            elseif ismatrix(x_mat)
                if numel(x_mat) == 1
                    x_py = x_mat;  % scalar numeric value
                elseif isreal(x_mat)
                    x_py = py.numpy.array(x_mat);
                else
                    x_py = py.numpy.array(real(x_mat)) + 1j*py.numpy.array(imag(x_mat));
                end
            end
        case 'logical'
            if x_mat
                x_py = py.True;
            else
                x_py = py.False;
            end
        case 'cell'
            x_py = py.list();
            dims = size(x_mat);
            if prod(dims) == max(size(x_mat))
                % 1D cell array
                for i = 1:numel(x_mat)
                    x_py.append(mat2py(x_mat{i}, char_to));
                end
            else
                if length(dims) > 2
                    % Concatenate with []; '+' on char arrays is numeric addition.
                    fprintf(['mat2py:  %d-dimensional cell array conversion ', ...
                             'is not implemented\n'], length(dims));
                    return
                end
                nR = dims(1,1); nC = dims(1,2);
                for r = 1:nR
                    this_row = py.list();
                    for c = 1:nC
                      this_row.append(mat2py(x_mat{r,c}, char_to));
                    end
                    x_py.append(this_row);
                end
            end
        case 'struct'
            x_py = py.dict();
            F = fieldnames(x_mat);
            if (length(x_mat) > 1) && ...
               (class(x_mat) == "struct")
                % struct array
                x_py = py.list();
                for j = 1:length(x_mat)
                    % pass char_to along so nested chars honor the option
                    x_py.append( mat2py(x_mat(j), char_to) );
                end
            else
                for i = 1:length(F)
                    if (length(x_mat.(F{i})) > 1) && ...
                       (class(x_mat.(F{i})) == "struct")
                        % struct of struct array
                        List = py.list();
                        for j = 1:length(x_mat.(F{i}))
                            List.append( mat2py(x_mat.(F{i})(j), char_to) );
                        end
                        x_py.update(pyargs(F{i}, List));
                    else
                        x_py.update(pyargs(F{i}, mat2py(x_mat.(F{i}), char_to)));
                    end
                end
            end
        otherwise
            fprintf('mat2py:  %s conversion is not implemented\n', class(x_mat))
    end % switch
end
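
For readers less familiar with the Python side, the index handling in the sparse branch can be sketched in plain Python. This is only an illustration under stated assumptions, not part of mat2py.m: the helper names `coo_args_from_matlab` and `to_dense` are hypothetical, and lists stand in for NumPy arrays. It shows the 1-based to 0-based shift and the `(values, (rows, cols))` nesting that the MATLAB code builds before calling `coo_matrix`.

```python
def coo_args_from_matlab(rows, cols, vals, shape):
    """Shift MATLAB's 1-based find() indices to 0-based and nest them COO-style."""
    zero_rows = [r - 1 for r in rows]
    zero_cols = [c - 1 for c in cols]
    # Same nesting the MATLAB code builds: (values, (row_idx, col_idx)), shape
    return (list(vals), (zero_rows, zero_cols)), tuple(shape)


def to_dense(coo_data, shape):
    """Expand the nested COO tuple into a dense list of lists (for checking)."""
    vals, (rows, cols) = coo_data
    dense = [[0] * shape[1] for _ in range(shape[0])]
    for r, c, v in zip(rows, cols, vals):
        dense[r][c] += v  # this sketch sums repeated (row, col) entries
    return dense


# Triplets as MATLAB's find() would report them for [0 5; 7 0]
data, shape = coo_args_from_matlab([2, 1], [1, 2], [7, 5], (2, 2))
dense = to_dense(data, shape)
```

Forgetting the index shift would place every value one row and one column too far, or raise an out-of-range error for entries in the last row or column.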

AlDanial added a commit that referenced this issue Sep 15, 2023
support bool type in py2mat, #3

fix microsecond/millisecond conversion error in datetime

add conversion and performance tests for mat2py and py2mat
@AlDanial
Collaborator

@hcommin your contributions were fantastic, thank you! I plan to post about the improvements on MATLAB Central (https://www.mathworks.com/matlabcentral/answers/?category=matlab%2Findex&sort=updated+desc&term=matlab+python).
I'll cite "https://github.com/hcommin" as the source of the improvements in that post unless you'd prefer something else.
Also: the mat2py and py2mat functions had errors converting fractional seconds. MATLAB uses milliseconds, while Python's datetime uses microseconds. All along I thought "MS" in MATLAB's documentation meant microseconds, but it actually means milliseconds.
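
To make the unit handling concrete, here is a minimal Python sketch of the seconds split, assuming the input is a fractional seconds value like MATLAB's `Second` property. It is not the code committed to the repository; the helper name `split_seconds` is hypothetical. It produces the integer second and microsecond fields that `datetime.datetime` expects, and clamps the result because `datetime` rejects a microsecond value of 1000000.

```python
import math
from datetime import datetime


def split_seconds(seconds):
    """Split fractional seconds into (whole seconds, microseconds)."""
    whole = math.floor(seconds)
    micro = round((seconds - whole) * 1_000_000)
    # datetime requires 0 <= microsecond <= 999999
    micro = min(micro, 999_999)
    return int(whole), int(micro)


# 12.345 s is 12 s plus 345 ms, which is 345000 microseconds
sec, usec = split_seconds(12.345)
stamp = datetime(2023, 9, 15, 10, 30, sec, usec)
```

A value given in milliseconds has to be multiplied by 1000 before it is passed as the microsecond argument; mixing the two units shifts the fractional part by three decimal places.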
